\documentclass[10pt,a4paper,twoside]{article}
\usepackage[polutonikogreek,english]{babel}
%The following makes *everything* Greek!
%\usepackage{greek}

\renewcommand{\thetable}{\Roman{table}}

\usepackage{amsthm}
\swapnumbers

\input{../../math/abbreviations}
%\newcommand{\pref}[1]{(\ref{#1})}

\theoremstyle{definition}
\newtheorem{exercise}[theorem]{Exercise}

\theoremstyle{plain}
\newtheorem{Peano}[theorem]{Peano Axioms}
\newtheorem{Infinity}[theorem]{Axiom of Infinity}
\newtheorem{Choice}[theorem]{Axiom of Choice}

\input{../../math/format}

\newcommand{\todaysdate}{2004.2.20}

\title{Notes on Set-Theory}
\author{David Pierce}
\date{\todaysdate}
%\pagestyle{headings}

 \pagestyle{myheadings}
 \markboth{NOTES ON SET-THEORY}{\leftmark}

\newcommand{\sectbegin}{\S\ \thesection\quad }

%    \usepackage{fancyheadings} \pagestyle{fancy}
%    \lhead[\thepage]{\S~\thesection} \rhead[\S~\thesection]{\thepage}
%    \chead[]{}    \lfoot[]{} \rfoot[]{} \cfoot[]{}


% \addtolength{\voffset}{-2cm}
% \addtolength{\textheight}{4cm}

\newcommand{\Tur}[1]{\texttt{#1}} % for words in Turkish
\newcommand{\Eng}[1]{\textsl{#1}} %for words in English
%\newcommand{\Gk}[1]{\textsf{#1}}
\newcommand{\Lat}[1]{\textsc{#1}}
\newcommand{\Gk}[1]{\begin{greektext}{#1}\end{greektext}}
\newcommand{\lett}[1]{\textsf{#1}}

\renewcommand{\theenumi}{\fnsymbol{enumi}}
\renewcommand{\labelenumi}{\textnormal{(\theenumi)}}

\renewcommand{\theequation}{\fnsymbol{equation}}

\newcommand{\axz}{Axiom Z}
\newcommand{\axu}{Axiom U}
\newcommand{\axi}{Axiom I}


\begin{document}
 \setcounter{section}{-1}
 \maketitle\thispagestyle{empty}

\section{Introduction}\label{sect:intro} 
\markright{\sectbegin Introduction}
\subsection{}

The book of Landau \cite{MR12:397m} that influences these notes
begins with two prefaces, one for the student and one for the
teacher.  The first asks the student not to read the second.
Perhaps Landau hoped to \emph{induce} the student to read the Preface
for the Teacher, but not to worry about digesting its contents.
I have such a hope concerning \S~\ref{subsect:teacher} below.

An earlier version of these notes\footnote{I prepared the earlier
  version for
  the first-year course at METU called `Fundamentals of Mathematics'
  (Math 111); but those notes contained much more than that course had
  time for.} began immediately with a study of the natural numbers.
  The set-theory in those notes was somewhat \emph{na\"\i ve}, that
  is, non-axiomatic.  Of the usual so-called Zermelo--Fraenkel
Axioms with Choice, the notes \emph{did} mention the Axioms of
Foundation, Infinity and Choice, but not (explicitly) the others.  
The present notes \emph{do} give all of the axioms\footnote{In their
  order of appearance here, they are: Extensionality
  (p.~\pageref{ax:extensionality}), Pairing (p.~\pageref{ax:pairing}),
  Comprehension (p.~\pageref{ax:comprehension}), Power-set
  (p.~\pageref{ax:power-set}), Union (p.~\pageref{ax:union}),
  Replacement (p.~\pageref{ax:replacement}), Infinity
  (p.~\pageref{ax:infinity}), Choice (p.~\pageref{ax:choice}) and
  Foundation (p.~\pageref{ax:foundation}).}
  of $\zfc$.

What is a set?
First of all, a set is many things that can be considered as
one; it is a multitude
that is also a unity; it is something like a \tech{number}\footnote{I
  may set technical terms in a slanted font thus, by way of
  acknowledging that they \emph{are} technical terms.}.
Therefore, set-theory might be an 
appropriate part of the education of the guardians of an ideal
city---namely, the city that
Plato's Socrates describes in the \emph{Republic}.
The following translation from Book VII (524d--525b) is mine, but
depends on the translations of Shorey \cite{Shorey} and Waterfield
\cite{Waterfield}.  I have inserted some of the original Greek words,
especially\footnote{I have also included certain derivatives of the
  present participle \Gk{>'ont-} corresponding to the English
  \Eng{being}.  Addition of the abstract-noun suffix \Gk{-'ia} yields
  \Gk{o>us'ia}; the corresponding Turkish might be \Tur{olurluk}.  The
  Greek \Gk{o>us'ia} is sometimes translated as \Eng{substance}, and
  indeed both words can connote wealth.
  Putting the definite article in front of the nominative neuter form
  of \Gk{>'ont-} creates \Gk{t`o >'on}.}
those that are origins of English words.  (See Table \ref{table:Greek}
below for transliterations.)

\begin{table}[b]
\caption{The Greek alphabet}\label{table:Greek}
  \begin{center}
      \begin{tabular}{| c  l | c  l | c l | c l |} \hline
\Gk{A a} & \textbf alpha  & \Gk{H h} & \textbf{\=e}ta   & \Gk{N n} & \textbf nu
& \Gk{T t} & \textbf tau \\ 
\Gk{B b} & \textbf beta   & \Gk{J j} & \textbf{th}eta & \Gk{X x} & \textbf xi
& \Gk{U u} & \textbf upsilon \\ 
\Gk{G g} & \textbf gamma  & \Gk{I i} & \textbf iota  & \Gk{O o} & \textbf
omicron& \Gk{F f} & \textbf{ph}i\\ 
\Gk{D d} & \textbf delta  & \Gk{K k} & \textbf kappa & \Gk{P p} & \textbf pi
& \Gk{Q q} & \textbf{ch}i\\ 
\Gk{E e} & \textbf epsilon& \Gk{L l} & \textbf lambda& \Gk{R r} & \textbf{r}ho
& \Gk{Y y} & \textbf {ps}i\\ 
\Gk{Z z} & \textbf zeta   & \Gk{M m} & \textbf mu    & \Gk{S sv/s} & \textbf
sigma  & \Gk{W w} & \textbf{\=o}mega\\ \hline 
  \end{tabular}
  \end{center}
The first letter or two of the (Latin) name provides a transliteration
for the Greek letter.  In texts, the rough-breathing mark \Gk{<} over
an initial
vowel (or \Gk r) corresponds to a preceding \lett h; the
smooth-breathing mark \Gk{>} and the three tonal accents can be
ignored. 
\end{table}

\begin{quotation}
`So this is what I [Socrates] was just trying to explain:  Some things
  are thought-provoking, and some are not.  Those things are called
  \defn{thought-provoking} that strike our sense together with their
  opposites.  Those that do not, do not
tend to awaken reflection.'

`Ah, now I understand' he [Glaucon] said.  `It seems that way to me,
too.'

`Okay then. Which of these do \emph{multiplicity}
  (\Gk{>arijm'oc}) and \emph{unity} (\Gk{t`o <'en}) seem to be?' 

`I can't imagine' he said. 

`Well,' I said `reason it out from
what we said.  If unity is fully grasped alone, in itself,  by sight
or some other sense, then it must be [an object] like a finger, as we
were explaining: it does not draw us towards \emph{being-ness}
(\Gk{o>us'ia}).  But if some discrepancy is always seen with it, so as
to appear not rather \emph{one} (\Gk{<'en}) than its opposite, then a
decision is needed---indeed, the \emph{soul} (\Gk{yuq'h}) in itself is
compelled to be puzzled, and to cast about, arousing thought within
itself, and to ask:
What then is unity as such?  And so the \emph{study}
  (\Gk{m'ajhsic}) of unity must be among those that lead and guide
  [the soul] to the sight of \emph{that which is} (\Gk{t`o >'on}).'

 `But certainly' he
  said `vision is especially like that.  For, the same thing is seen
  as one and as \emph{indefinite multitude}
  (\Gk{>'apeira t`o pl\~hjoc}).'

`If it is so with unity,' I said `is it not so with every  \emph{number}
  (\Gk{>arijm'oc})?'

`How could it not be?'

`But \emph{calculation}
  (\Gk{logistik'h}) and \emph{number-theory} (\Gk{>arijmhtik'h}) are
entirely about number.'

`Absolutely.'

`And these things appear to lead to truth.'

`Yes, and extremely well.'

`So it seems that these must be some of the \emph{studies}
  (\Gk{majhm'ata}) that
  we are looking for.  Indeed, the \emph{military} (\Gk{polemik'on}) needs to
  learn them for deployment [of troops],---and the philosopher,
  because he has to rise out of [the world of] \emph{becoming}
  (\Gk{g'enesic}) in order to take hold of being-ness, or else he will
  never \emph{become a calculator} (\Gk{logistik\~w| gen'esjai}).'

`Just so' he said.

`And our guardian happens to be both military man and philosopher.' 

`Of course.'

`So, Glaucon, it is appropriate to require this study by law and to
  persuade those who intend to take part in the greatest affairs of
  the city to go into calculation and to engage in it not \emph{as a pastime}
  (\Gk{>idiwtik\~wc}), but until they have attained, by thought
  itself, the vision of the nature of numbers, not [for the sake of]
  buying and selling, as if they were preparing to be merchants or
  shopkeepers, but for the sake of war and an easy turning of the soul
  itself from becoming towards truth and being-ness.'

`You speak superbly' he said.
\end{quotation}
(In reading this passage from Plato, and in particular the comments on
war, one can hardly be sure
that Socrates is not pulling Glaucon's leg.  Socrates previously
(369b--372c) described a primitive, peaceful,
vegetarian city, which Glaucon rejected (372c--d) as
being fit only for pigs.)

The reader of the present notes is not assumed to have much knowledge
`officially'.  But the reader should have some awareness of the
Boolean connectives of propositional logic and their connexion with
the Boolean operations on sets.  (A dictionary of the connectives is
in Table \ref{table:connectives} below.)

\begin{table}[b]
\caption{Boolean connectives}\label{table:connectives}
  \begin{center}
  \begin{tabular}{| c | l | l |}\hline
$\land$ & \Eng{and} & conjunction\\ \hline
$\lor$ & \Eng{or} & disjunction\\ \hline
$\lnot$ & \Eng{not} & negation \\ \hline
$\to$ & \Eng{implies} & implication\\ \hline
$\iff$ & \Eng{if and only if} & biconditional\\ \hline
  \end{tabular}
\end{center}
\end{table}
One theme of these notes is the relation between \tech{definition by
  recursion} and \tech{proof by induction}.  The development of
  propositional logic already requires recursion and
  induction.\footnote{Here I use the words  `recursion' and
  `induction' in a more general sense than in the definitions on
  pp.~\pageref{page:induction} and \pageref{page:recursion}.}    For
  example, \defn{propositional formulas}\footnote{Words in bold-face
  in these notes are being defined.} are defined recursively:
  \begin{enumerate}
    \item
\tech{Propositional variables} and $0$ and $1$ are propositional formulas.
\item
If $F$ is a propositional formula, then so is $\lnot F$.
\item
If $F$ and $G$ are propositional formulas, then so is
$(F\Bcon G)$, where $\Bcon$ is $\land$, $\lor$, $\to$ or $\iff$.
  \end{enumerate}
The \defn{sub-formulas} of a formula are also defined recursively:
\begin{enumerate}
  \item
Every formula is a sub-formula of itself.
\item
Any sub-formula of a formula $F$ is a sub-formula of $\lnot F$.
\item
Any sub-formula of $F$ or $G$ is a sub-formula of $(F\Bcon G)$.
\end{enumerate}
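For instance, the two definitions together yield the sub-formulas of
$(\lnot P\lor Q)$: the formula is a sub-formula of itself; the
sub-formulas of $\lnot P$ (namely $\lnot P$ and $P$) and of $Q$
(namely $Q$) are sub-formulas of the disjunction.  So the sub-formulas
are
\begin{equation*}
  (\lnot P\lor Q),\quad \lnot P,\quad P,\quad Q.
\end{equation*}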
Now, two formulas are \defn{equivalent} if they have the same
\tech{truth-table}.  For example, $(P\to Q)$ and $(\lnot P\lor Q)$ are
equivalent, because their truth-tables are, respectively:
\begin{center}
  \begin{tabular}{c|c|c}
$P$ & $\to$ & $Q$ \\ \hline
$0$ & $1$ & $0$\\
$1$ & $0$ & $0$\\
$0$ & $1$ & $1$\\
$1$ & $1$ & $1$
  \end{tabular}\qquad
  \begin{tabular}{c|c|c|c}
    $\lnot$ & $P$ & $\lor$ & $Q$\\ \hline
$1$&$0$&$1$&$0$\\
$0$&$1$&$0$&$0$\\
$1$&$0$&$1$&$1$\\
$0$&$1$&$1$&$1$
  \end{tabular}
\end{center}
Suppose $F$ and $G$ are equivalent; this is denoted
\begin{equation*}
  F\sim G.
\end{equation*}
Suppose also $F$ is a sub-formula of $H$, and $H'$ is the result of
replacing $F$ in $H$ with $G$.  The \defn{Substitution Theorem} is
that
\begin{equation*}
  H\sim H'.
\end{equation*}
Because of the recursive definition of propositional formulas, we can
prove the Substitution Theorem by induction as follows:
\begin{enumerate}
  \item
The claim is trivially true when $H$ is a
propositional variable or $0$ or $1$, since then $F$ \emph{is} $H$, so
$H'$ is $G$.  
\item
Suppose, as an inductive hypothesis, that the 
claim is true when $H$ is $H_0$.  Then we can
show that the claim is true when $H$ is $\lnot H_0$.
\item
Suppose, as an inductive hypothesis, that the 
claim is true when $H$ is $H_0$ and when $H$ is $H_1$.  Then we can
show that the claim is true when $H$ is $(H_0\Bcon H_1)$, where
$\Bcon$ is as above. 
\end{enumerate}
Such a proof is sometimes said to be a `proof by induction on the
  complexity of propositional formulas'.
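For example, since $(P\to Q)\sim(\lnot P\lor Q)$ by the truth-tables
above, the Substitution Theorem (with $H$ being $\lnot(P\to Q)$) gives
\begin{equation*}
  \lnot(P\to Q)\sim\lnot(\lnot P\lor Q)
\end{equation*}
immediately, with no need to compute the truth-tables of these longer
formulas.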

A conjunction corresponds to an
\tech{intersection} of sets, and so forth, but this is spelled out in
\S~\ref{sect:sets} below.  I shall also use formulas of
\tech{first-order} logic, and in particular the \tech{quantifiers}
(given in Table \ref{table:quantifiers} below).
\begin{table}[b]
\caption{Quantifiers}\label{table:quantifiers}
  \begin{center}
  \begin{tabular}{| c | l | l |}\hline
$\forall$ & \Eng{for all} & universal\\ \hline
$\exists$ & \Eng{there exists\dots such that} & existential\\ \hline
  \end{tabular}
\end{center}
\end{table}
For emphasis, instead of $\to$ and $\iff$,
I may use the arrows $\implies$ and $\Iff$ between formulas.

My own research-interests lie more in
model-theory than in set-theory.  I aim here just to
set down some established mathematics as precisely as
possible, without much discussion.  (There is a textbook that has been in use
for over two thousand years, but that contains no discussion at all, only
axioms, definitions, theorems and proofs.  This is Euclid's
\emph{Elements}~\cite{MR17:814b}.)  I do think that explicit reference
to models can elucidate some points.  The reader should
also consult texts by people who \emph{are} set-theorists, for other
points of view, for historical references, and to see how the field
has developed beyond what is
given in these notes.  Also, the reader should remember that these
notes are still a rough draft.  There are not yet many exercises, and
some of them are difficult or lacking in clear answers.

\subsection{}\label{subsect:teacher}
Any text on axiomatic set-theory will introduce the set $\varN$, which
is the
smallest set that contains $\emptyset$ and that contains $x\cup\{x\}$
whenever it contains $x$.  The text \emph{may} (but need not) mention
that $\varN$ is a
model of the Peano axioms for the natural numbers.  The present notes
differ from some published texts in two ways:
\begin{itemize}
  \item
I prove facts about the natural numbers \emph{from the Peano
  axioms}, not just \emph{in $\varN$}.
\item
I mention structures that are models of some, but not all, of the
Peano axioms.
\end{itemize}

I set out a minimum of set-theory in \S~\ref{sect:sets}, enough so
that the properties of natural numbers can be derived from the Peano
axioms, starting in \S~\ref{sect:Peano}.  Some set-theory books, such
as Ciesielski \cite[\S~3.1]{MR99c:04001},
will immediately give $\varN$ as a model of these axioms.
Certain properties of natural numbers are
easier to prove \emph{in this model} than \emph{by the Peano axioms}.
I prefer to follow the axiomatic approach for several
reasons:

One reason is practice.  It is worthwhile to have experience with
the Peano axioms as well as $\zfc$, especially since, unlike $\zfc$,
the Peano axioms include a second-order statement.  (It may be that
some writers assume that the reader has already had
sufficient practice with the Peano axioms; I do not make such an
assumption.) 

The Peano axioms are more natural than their specific model $\varN$.
The elements of $\varN$ (as well as $\varN$ itself) are so-called
\tech{von Neumann ordinals}, that is, \tech{transitive} sets 
that are \tech{well-ordered} by \tech{containment}. 
In a slightly different
context, the model-theorist Poizat \cite[\S~8.1]{MR2001a:03072} observes:
\begin{quote}
  We meet some students who are allergic to ordinals as
  `well-ordering types' and who find the notion of von Neumann
  ordinals easier to digest; that is a singular consequence of
  dogmatic teaching, which confuses formalism with rigor, and which
  favors technical craft to the detriment of the fundamental idea: It
  takes a strangely warped mind to find the notion of a transitive set
  natural! 
\end{quote}

A third reason for taking the axiomatic approach to the natural
numbers is that it can bring out a distinction that is often ignored.  The
structure of the natural numbers admits \tech{proof by induction} and
\tech{definition by recursion}.   Vaught \cite[ch.~2,
  \S~4]{MR95k:03001}, for example, says that
recursion \emph{is} `the same thing as definition by induction'.
Since it is just about terminology, the statement is not wrong.  But
definition by `induction' or recursion\footnote{In the sense defined
  on p.~\pageref{page:recursion} below.} works \emph{only} in models of
the Peano axioms, while there are other structures in which
\emph{proof} by induction\footnote{In the sense defined on
  p.~\pageref{page:induction}.} works.

There are `strong' versions of induction and recursion.  There is
proof by strong induction, and definition by strong recursion.  
Admission of either of \emph{these} is equivalent to admission of the
other; the structures that admit them are precisely the well-ordered
sets.  Some basic undergraduate texts suggest confusion on this point.
For example, in talking about the integers, one book\footnote{Namely,
  Epp \cite[\S~4.4, p.~213]{Epp}, used sometimes in Math 111 and 112.}
says:  
\begin{quote}
It is apparent that if the principle of strong mathematical induction
  is true, then so is the principle of ordinary mathematical
  induction\dots 
  It can also be shown that if the principle of ordinary mathematical
  induction is true, then so is the principle of strong mathematical
  induction.  A proof of this fact is sketched in the exercises\dots
\end{quote}
Both statements about induction here are literally false.
The second statement is correct if it is understood to
mean simply that the natural numbers satisfy the principle of strong
induction.  The `proof' that is
offered for the first statement uses implicitly
that every integer is a \tech{successor}, something that does not
follow from strong induction.  

Finally, by emphasizing the axiomatic development of the natural
numbers, I hope to encourage the reader to watch out for unexamined
assumptions, in these notes and elsewhere.  The Hajnal text
\cite{MR2000m:03001} defines
$\varN$ on the first page of \S~1 as `the set of nonnegative
integers'.  Then come a hundred pages of the set-theory covered in
the present notes, and more.  The Preface says that this work `is
carried out on a quite precise, but intuitive level'; only after
\emph{this} does the reader get, in an appendix, on p.~127, a
rigorous definition of $\varN$.  To my mind, 
the precise but intuitive way to treat the natural numbers is by means
of the Peano axioms.  Perhaps the reader of Hajnal is supposed to have
seen such a treatment before, since, according to the index, the term
`Peano' appears only once, on p.~133, and there is no definition.

Devlin \cite{MR94e:03001} seems never to mention the natural numbers
as such at all, though early on (p.~6), he asserts the existence of sets
$\{a_1,\dots,a_n\}$.  (Later he defines the symbol $\varN$, na\"\i
vely on p.~24, rigorously on p.~66.)  Like Hajnal, Moschovakis
\cite{MR95a:04001}
\emph{names} the set of natural numbers on the first page of text; but
then he discusses set-theory for only fifty pages before devoting a
chapter to a rigorous treatment of the natural numbers.

\section{Sets and classes}\label{sect:sets}
\markright{\sectbegin Sets and classes}
A set has \defn{members}, or
\defn{elements}.  A set \defn{contains} its elements, and the elements
\defn{compose} the set.  To say that a set $A$ has an element $b$, we
may write
\begin{equation*}
  b\in A,
\end{equation*}
using between $b$ and $A$ a symbol derived from the Greek minuscule
epsilon, which can be understood as standing for the Latin word
\Lat{elementvm}.  A set is not \emph{distinct} from its
elements in the way that a box is distinct from its contents.  A set
may be distinct from any \emph{particular} element.  But I propose to
say that a set \emph{is} its elements, and the elements \emph{are} the
set.

This is a paradoxical statement.  How can one thing be many, and many,
one?  The
difficulty of answering this is perhaps reflected in the difficulties
of set-theory.  In any case, if a set is its elements, then the
elements \emph{uniquely determine} the set.  This is something whose
meaning we can express mathematically; it is perhaps the most
fundamental axiom of set-theory:

\begin{axiom}[Extensionality]\label{ax:extensionality}
  If two sets $A$ and $B$ have the same members, then $A=B$.
\end{axiom}
The converse of this axiom is trivially true:  If two sets have
different members, then of course the sets themselves are different.

A set is also the sort of thing that can \emph{be} an element:  If $A$ and
$B$ are sets, then the statement $A\in B$ is meaningful, and the
statement 
\begin{equation*}
  A\in B \lor A\notin B
\end{equation*}
is true.

Are all elements sets themselves?  We do not answer this question; we
avoid it:

\begin{definition}
  A property $P$ of sets is \defn{hereditary}, provided that, if $A$
  is a set with property $P$, then all elements of $A$ are \emph{sets}
  with property $P$.  A \emph{set} is \defn{hereditary} if it has a
  hereditary property.
\end{definition}

We shall ultimately restrict our attention to hereditary
sets.\footnote{See also Kunen
\cite[ch.~1, \S~4]{MR85e:03003} for discussion of this point.}   Now,
we shall not assert, as an axiom, that all sets are hereditary.  We
cannot now formulate such an axiom precisely, since we do not yet have a
definition of a `property' of sets.  The \emph{language} with which we
talk about sets will end up ensuring that our sets are hereditary:

Everything that we shall say about sets can be said with the symbol
$\in$, along with $=$ and the logical symbols given in Tables
\ref{table:connectives} and \ref{table:quantifiers} of
\S~\ref{sect:intro}, and with variables and names \emph{for
individual sets}.

\begin{definition}\label{defn:formula}
The \defn{$\in$-formulas} are 
recursively defined\footnote{This definition uses also brackets
  (parentheses) in the formulas, but the brackets do not carry meaning
  in the way that the
  other symbols do.  The brackets are meaningful in the way that the
  \emph{order} of the symbols in a formula is meaningful.  Indeed, we
  could dispense with the brackets by using the so-called Polish or \L
  ukasiewicz notation, writing, say, $\mathord{\land}\phi\psi$ instead
  of $\phi\land\psi$.  I shall use the infix notation instead,
  but shall omit brackets where they are not needed.} as follows:
\begin{enumerate}
  \item
If $x$ and $y$ are variables, and $A$ and $B$ are names, then
$x\bigcirc y$, $x\bigcirc A$, $A\bigcirc x$ and $A\bigcirc B$ are
\defn{atomic} $\in$-formulas, where $\bigcirc$ is $\in$ or $=$.
\item
If $\phi$ and $\psi$ are $\in$-formulas, then so are $\lnot\phi$ and
$(\phi\Bcon\psi)$, where $\Bcon$ is one of $\land$, $\lor$, $\to$ and $\iff$.
\item
If $\phi$ is an $\in$-formula and $x$ is a variable, then $(\Quant
x\phi)$ is an $\in$-formula, where $\Quant$ is $\forall$ or $\exists$.
\end{enumerate}
\end{definition}
The $\in$-formulas are the \tech{first-order} formulas in the
\tech{signature} consisting of $\in$ alone.  Other signatures are
discussed later (see Definition \ref{defn:arb-formula}).  In any
signature, the first-order formulas are 
defined recursively as $\in$-formulas are, but the atomic formulas
will be different.  In a first-order formula, only variables can follow
quantifiers; otherwise, the distinction between a variable and a name
is not always clear (see Exercise \ref{exer:var-name}).  Also, in a
first-order formula, variables and 
names refer only to individual objects, rather than, say, sets of
objects.  In set-theory, our objects \emph{are} sets, so it would not
make much sense to have more than one kind of variable.\footnote{See also the
comments of Levy \cite[ch.~1, \S~1, p.~4]{MR80k:04001}.}
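As an example of the recursion in Definition \ref{defn:formula}, if
$A$ is a name, then
\begin{equation*}
  (\Forall x(x\in A\to(\Exists y\ y\in x)))
\end{equation*}
is an $\in$-formula: $x\in A$ and $y\in x$ are atomic; quantifying the
latter gives $(\Exists y\ y\in x)$; the implication formed from these
two is an $\in$-formula; and quantifying over $x$ yields the formula
displayed.  (Some of these brackets are not needed and may be
omitted.)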

Variables and names in $\in$-formulas are also called \defn{terms}.
(In other signatures, there will be a more general definition of
\tech{term}.)  Names used in formulas may be called \defn{parameters}.

In an $\in$-formula, instead of a sub-formula 
$\mathrel{\lnot} x\in y$, we can write
\begin{equation*}
  x\notin y;
\end{equation*}
and instead of $\mathrel{\lnot} x=y$, we can write
\begin{equation*}
  x\neq y;
\end{equation*}
here, $x$ and $y$ are terms.

If a first-order formula contains no quantifiers, 
then its variables are 
\defn{free} variables.  The free variables of $\Exists x\phi$ and
$\Forall x\phi$ are those of $\phi$, \emph{except} $x$.  The free
variables of $\lnot \phi$ are just those of $\phi$.  Finally, the free
variables of $\phi\Bcon\psi$ (where $\Bcon$ is one of $\land$, $\lor$, $\to$
and $\iff$) are those of $\phi$ or $\psi$.
A \defn{sentence} is a formula with no free variables.  
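For example, in $\Forall x(x\in y\to x\in z)$, the atomic sub-formulas
$x\in y$ and $x\in z$ have the free variables $x$ and $y$, and $x$ and
$z$, respectively; hence the implication has the free variables $x$,
$y$ and $z$; and the quantification removes $x$, leaving $y$ and $z$
free.  The formula is therefore not a sentence, while
\begin{equation*}
  \Forall z\Forall y\Forall x(x\in y\to x\in z)
\end{equation*}
is a sentence.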

We can now attempt to write the Extensionality Axiom as the sentence
\begin{equation}\label{eqn:extensionality}
  \Forall x\Forall y(\Forall z(z\in x\iff z\in y)\to x=y).
\end{equation}
Now, if the variables $x$, $y$ and $z$ can refer to any sets at all,
and if some sets contain objects that are not sets, then
\eqref{eqn:extensionality} is actually stronger than Axiom
\ref{ax:extensionality}.  Indeed, if $A$ is a set, and $b$ is an
object that is not a set, then there might be a set $\{A,b\}$
containing $A$ and $b$ and nothing else, and a set $\{A\}$ containing
$A$ and nothing else.  Then for all \emph{sets} $z$, we have
\begin{equation*}
  z\in\{A,b\}\Iff z\in \{A\}.
\end{equation*}
From this, \eqref{eqn:extensionality} seems to imply $\{A,b\}=\{A\}$,
which is evidently false.  Our solution to this problem will be to restrict
the variables in formulas like \eqref{eqn:extensionality} to
\emph{hereditary} sets.  In this way,  \eqref{eqn:extensionality}
becomes merely a special case of the Extensionality Axiom.  In
particular, since the set $\{A,b\}$ is not hereditary,
\eqref{eqn:extensionality} says nothing about it.

\begin{exercise}
  Alternatively, we might let our variables range over all (mathematical)
objects, even if some of these might not be sets.  If $a$ is not a
set, then we should require $\Forall x x\notin a$.  In this case, if
\eqref{eqn:extensionality} is still true, prove that there is at most 
one object that is not a set.
\end{exercise}

In the
`Platonic' view of set-theory, when the logical symbols in an
$\in$-sentence are interpreted as in Tables \ref{table:connectives}
and \ref{table:quantifiers} of \S~\ref{sect:intro}, and when terms are
understood to refer to hereditary sets, then
the sentence is either true or
false.  (A `relative' notion of truth is given in Definition
\ref{defn:truth}.)   Then we are looking for the true $\in$-sentences; in
particular, we 
are looking for some `obviously' true sentences---\tech{axioms}---from which
all other true sentences about hereditary sets follow
logically.\footnote{This project must fail.  By G\"odel's
  Incompleteness Theorem, we cannot define a list of axioms from
  which all truths of set-theory follow.  We can still hope to
  identify axioms from which \emph{some} interesting truths follow.
  One purpose of these notes is to develop some of these
  interesting truths.}

A first-order formula $\phi$ with at most one free variable is called
\defn{unary}; if that free variable is $x$, then the formula
might be written
\begin{equation*}
  \phi(x).
\end{equation*}
If this is an $\in$-formula, it expresses a \defn{property} that sets
might have.  If $A$ has that property, then we can assert
\begin{equation*}
  \phi(A).
\end{equation*}
Formally, we obtain the sentence $\phi(A)$ from $\phi$ by replacing
each \tech{free occurrence} of $x$ with $A$.  A precise recursive definition
of $\phi(A)$ is possible.  Here it is, for thoroughness, although we
shall not spend time with it:

\begin{definition}
For any first-order formula $\phi$, variable
$x$ and term $t$, the formula
\begin{equation*}
  \phi_t^x
\end{equation*}
is the result of \tech{freely} replacing each free occurrence of $x$ in
 $\phi$ with $t$; it is determined recursively as follows:
\begin{enumerate}
  \item
If $\phi$ is atomic, then $\phi_t^x$ is the result of replacing
\emph{each} instance of $x$ in $\phi$ with $t$.
\item
$(\lnot\phi)_t^x$ is $\lnot(\phi_t^x)$, and $(\phi\Bcon\psi)_t^x$ is
  $\phi_t^x\Bcon\psi_t^x$. 
\item
$(\Quant x\phi)_t^x$ is $\Quant x\phi$.
\item
If $y$ is not $x$ and does not appear in $t$, then $(\Quant
y\phi)_t^x$ is $\Quant y\phi_t^x$.
\item
If $y$ is not $x$, but $y$ does appear in $t$, then $(\Quant
y\phi)_t^x$ is $\Quant 
z(\phi_z^y)_t^x$, where $z$ is a variable that does not appear in
$t$ or $\phi$.
\end{enumerate}
If $\phi$ is $\phi(x)$, then  $\phi_t^x$
can be denoted
\begin{equation*}
  \phi(t).
\end{equation*}
\end{definition}
The
point is that if, for example, $\phi$ is $\psi(x)\land\Exists
x\chi(x)$, then $\phi(A)$ is $\psi(A)\land\Exists x\chi(x)$.
Alternatively, $\phi$ might be $\Exists y(\psi(x)\land\chi(y))$, in
which case $\phi(y)$ is $\Exists z(\psi(y)\land\chi(z))$, not $\Exists
y(\psi(y)\land\chi(y))$. 
  
The sets with the property given by $\phi(x)$ compose a
\defn{class}, denoted
\begin{equation}%\label{eqn:class}
  \{x: \phi(x)\}.
\end{equation}
This is the class of sets that \defn{satisfy} $\phi$, the class of $x$
such that $\phi(x)$.  But not every class is a set; not every class
is a unity to the extent that it can be considered as a member of
sets:

\begin{theorem}[Russell Paradox]
The class $\{x:x\notin x\}$ is not a set.
\end{theorem}

\begin{proof}
Suppose $A$ is a set such that
\begin{equation}\label{eqn:Russell}
  x\in A\implies x\notin x
\end{equation}
for all sets $x$.  Either $A\notin A$ or $A\in A$, but in the
latter case, by \eqref{eqn:Russell}, we still have $A\notin A$.
Therefore $A$ is a member of $\{x:x\notin x\}$, but not of $A$ itself; so
\begin{equation*}
  A\neq\{x:x\notin x\}.
\end{equation*}
Hence $\{x:x\notin x\}$ cannot be a set.
\end{proof}

\begin{remark}
  The Russell Paradox is often established by contradiction:  If the
  class $\{x:x\notin x\}$ is a set $A$, then both $A\in A$ and $A\notin A$,
  which is absurd.  However, the proof given above shows that a false
  assumption is not needed.
\end{remark}

\begin{exercise}
The sets that we are considering compose the class $\{x:x=x\}$.  It is
logically true that
\begin{equation*}
  \Forall x\Forall y(y\in x\to y=y).
\end{equation*}
Explain how this is a proof that all of our sets are hereditary.
\end{exercise}

Not every class is a set; but every set $A$ is the class $\{x:x\in
A\}$.  A
\tech{disjunction} of formulas gives us the \defn{union} of
corresponding classes:
\begin{equation*}
  \{x:\phi(x)\lor\psi(x)\}=\{x:\phi(x)\}\cup\{x:\psi(x)\}.
\end{equation*}
Likewise, a \tech{conjunction} gives an \defn{intersection}:
\begin{equation*}
  \{x:\phi(x)\land\psi(x)\}=\{x:\phi(x)\}\cap\{x:\psi(x)\};
\end{equation*}
and a \tech{negation} gives a \defn{complement}:
\begin{equation*}
  \{x:\lnot\phi(x)\}=\{x:\phi(x)\}\comp.
\end{equation*}
Finally, we can form a \defn{difference} of classes, not corresponding
to a single Boolean connective from our list:
\begin{equation*}
  \{x:\phi(x)\land\lnot\psi(x)\}=\{x:\phi(x)\}\setminus\{x:\psi(x)\}. 
\end{equation*}
If $\Forall x(\phi(x)\to\psi(x))$, then $\{x:\phi(x)\}$ is a
\defn{sub-class} of $\{x:\psi(x)\}$.  We can write
\begin{equation*}
  \Forall
  x(\phi(x)\to\psi(x))\Iff\{x:\phi(x)\}\included\{x:\psi(x)\}. 
\end{equation*}
We now have several abbreviations to use in writing $\in$-formulas:
\begin{align*}
  x\in A\cup B&\Iff x\in A\lor x\in B;\\
x\in A\cap B&\Iff x\in A\land x\in B;\\
x\in A\setminus B&\Iff x\in A \land x\notin B;\\
A\included B&\Iff \Forall x(x\in A\to x\in B).
\end{align*}
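For example, since $\lnot(\phi(x)\lor\psi(x))$ is equivalent to
$\lnot\phi(x)\land\lnot\psi(x)$, we obtain one of the \tech{De Morgan
  laws} for classes:
\begin{equation*}
  (\{x:\phi(x)\}\cup\{x:\psi(x)\})\comp
  =\{x:\phi(x)\}\comp\cap\{x:\psi(x)\}\comp.
\end{equation*}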

If sets exist at all, then any two sets ought to be members of some
set:

\begin{axiom}[Pairing]\label{ax:pairing}
For any two sets, there is a set that contains them:
\begin{equation*}
  \Forall x\Forall y\Exists z(x\in z\land y\in z).
\end{equation*}
\end{axiom}

The set given by the axiom might have elements other than those two
sets; we can cast them out by means of: 

\begin{axiom}[Comprehension]\label{ax:comprehension}
  A sub-class of a set is a set: For any $\in$-formula $\phi(x)$,
  \begin{equation*}
    \Forall x\Exists y\Forall z(z\in y\iff z\in x\land \phi(z)).
  \end{equation*}
\end{axiom}

Note that this axiom is not a single $\in$-sentence, but a
\tech{scheme} of $\in$-sentences.

A sub-class of a set $A$ can now be called a \defn{subset} of
$A$.  A set \defn{includes} its subsets.  A subset $B$ of $A$ that is
distinct from $A$ is a \defn{proper} subset of $A$, and we may then write
\begin{equation*}
  B\pincluded A.
\end{equation*}

We now have that, for any $x$ and $y$, there is a set
\begin{equation*}
  \{x,y\}
\end{equation*}
whose members are \emph{just} $x$ and $y$; if $x=y$, then this set is
\begin{equation*}
  \{x\},
\end{equation*}
 which is sometimes called a \defn{singleton}.

\begin{exercise}
  Prove that the class of all sets is not a set.
\end{exercise}

If $A$ is a set and $\phi$ is a (unary) formula, then the set
$A\cap\{x:\phi(x)\}$ can be written
\begin{equation*}
  \{x\in A:\phi(x)\}.
\end{equation*}
In particular, if $B$ is also a set, then
\begin{equation*}
  A\cap B=\{x\in A:x\in B\}.
\end{equation*}
As long as \emph{some} set $A$ exists,
  we have the \defn{empty set}, 
  \begin{equation*}
      \emptyset,
  \end{equation*}
which can be defined as $\{x\in A:x\neq
  x\}$.  Does some set exist?   I take this as a logical axiom:
  \begin{equation}\label{eqn:existence}
    \Exists x x=x.
  \end{equation}
Indeed, \emph{something} exists, as we might argue along with
Descartes \cite[II, \P~3]{Descartes:Med}:
\begin{quote}
  Therefore I will suppose that all I see is false\dots But certainly
  I should exist, if I were to persuade myself of something\dots Thus
  it must be granted that, after weighing everything carefully and
  sufficiently, one must come to the considered judgment that the
  statement `\emph{I am, I exist} (\Lat{ego svm, ego existo})' is
  necessarily true every time it is 
  uttered by me or conceived in my mind \cite[p.~17]{Cress}.
\end{quote}
Of course, we are claiming that \emph{hereditary sets} exist.  But I take
\eqref{eqn:existence} to be implicit in the assertion of any sentence,
such as \eqref{eqn:extensionality}. 

For any $x$ and $y$, the \defn{ordered pair} $(x,y)$ is the set
\begin{equation*}
  \{\{x\},\{x,y\}\}.
\end{equation*}
All that we require of this definition is that it allow us to prove the
following:

\begin{theorem}
  $(x,y)=(u,v)\Iff x=u\land y=v$.
\end{theorem}

\begin{exercise}
  Prove the theorem.
\end{exercise}
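The one property the definition must deliver can be tested mechanically on a small domain. The following Python sketch (the helper name \texttt{kpair} is ours) models Kuratowski's pair $\{\{x\},\{x,y\}\}$ with \texttt{frozenset}s:

```python
# Kuratowski's ordered pair {{x},{x,y}} as nested frozensets.
def kpair(x, y):
    return frozenset({frozenset({x}), frozenset({x, y})})

# The theorem: (x,y) = (u,v) iff x = u and y = v, checked exhaustively
# on a small domain.
items = [0, 1, 2]
for x in items:
    for y in items:
        for u in items:
            for v in items:
                assert (kpair(x, y) == kpair(u, v)) == (x == u and y == v)

# If x = y, the pair collapses to the singleton {{x}}:
assert kpair(1, 1) == frozenset({frozenset({1})})
```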

Given two classes $\class C$ and $\class D$, we can now form their
\defn{cartesian product}:
\begin{equation*}
  \class C\times\class D=\{(x,y):x\in\class C\land y\in\class D\}.
\end{equation*}

\begin{lemma}
  A cartesian product of classes is a well-defined class, that is, can
  be written as $\{x:\phi(x)\}$ for some $\in$-formula $\phi$.
\end{lemma}
\begin{exercise}
  Prove the lemma.
\end{exercise}
To prove that the cartesian product of \emph{sets} is a set, we can
use:
\begin{axiom}[Power-set]\label{ax:power-set}
If $A$ is a set, then there is a set $B$ such that
\begin{equation*}
  x\included A\implies x\in B
\end{equation*}
for all sets $x$.  That is,
\begin{equation*}
  \Forall x\Exists y\Forall z(\Forall w(w\in z\to w\in x)\to z\in y).
\end{equation*}
\end{axiom}

Hence, for any set $A$, its \defn{power-set}
$\{x:x\included A\}$
is a set; this is denoted 
\begin{equation*}
  \pow A.
\end{equation*}
  In particular,
$(x,y)\in\pow{\pow{\{x,y\}}}$, so
$A\times B\included\pow{\pow{A\cup B}}$.

If $A$ is a set, its \defn{union} is
$\{x:\Exists y(y\in A\land x\in y)\}$,
denoted 
\begin{equation*}
  \bigcup A.
\end{equation*}
In particular, for any sets $A$ and $B$,
\begin{equation*}
  A\cup B=\bigcup\{A,B\}.
\end{equation*}
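As a computational aside, the union of a set is simply the collection of members of its members; the Python helper below (\texttt{big\_union} is our own name) makes this concrete:

```python
# The union of A is {x : ∃y (y ∈ A ∧ x ∈ y)}: members of members of A.
def big_union(A):
    return frozenset(x for y in A for x in y)

A = frozenset({1, 2})
B = frozenset({2, 3})

# A ∪ B = ⋃{A, B}:
assert big_union(frozenset({A, B})) == A | B
```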

\begin{exercise}
  What are $\bigcup\emptyset$ and $\bigcup\{\emptyset\}$?
\end{exercise}

\begin{axiom}[Union]\label{ax:union}
  The union of a set is a set:
  \begin{equation*}
    \Forall x\Exists y\Forall z(\Exists w(z\in w\land w\in x)\to z\in
    y). 
  \end{equation*}
\end{axiom}

The union of a set $A$ might be denoted also
\begin{equation*}
  \bigcup_{x\in A}x.
\end{equation*}
Suppose that, for each $x$ in $A$, there is a set $B_x$.  We shall
soon be able to define a union
\begin{equation}\label{eqn:indexed-union}
  \bigcup_{x\in A}B_x.
\end{equation}
This will be the union of $\{B_x:x\in A\}$.  But for now, we don't
even know that this thing is a well-defined \emph{class}, much less a
set. 

\begin{theorem}
  The cartesian product of sets is a set.
\end{theorem}

\begin{exercise}
  Prove the theorem.
\end{exercise}

If $A$ is a set, its \defn{intersection} is
$\{x:\Forall y(y\in A\to x\in y)\}$,
denoted 
\begin{equation*}
  \bigcap A.
\end{equation*}
  If $A$ contains a set $B$ (that is, if $B\in A$), then $\bigcap
  A\included B$, so $\bigcap A$ is a
set.  Also, for any sets $A$ and $B$,
\begin{equation*}
  A\cap B=\bigcap\{A,B\}.
\end{equation*}

\begin{exercise}
  What are $\bigcap\emptyset$ and $\bigcap\{\emptyset\}$?
\end{exercise}

A \defn{relation} between $A$ and $B$ is a subset of $A \times B$.  If
$R\included A\times B$, then
\begin{equation*}
  R\inv =\{(y,x):(x,y)\in R\},
\end{equation*}
a relation between $B$ and $A$.  If also $S\included B\times C$, then
\begin{equation*}
  S\circ R=\{(x,z):\Exists y((x,y)\in R\land (y,z)\in S)\},
\end{equation*}
a relation between $A$ and $C$.

A relation between $A$ and itself is a \defn{binary} relation on $A$.
The set
\begin{equation*}
  \{(x,x):x\in A\}
\end{equation*}
is the \defn{diagonal} $\Delta_A$ on $A$.  A binary
relation $R$ on $A$ is:
\begin{itemize}
  \item
\defn{reflexive}, if $\Delta_A\included R$;
\item
\defn{irreflexive}, if $\Delta_A\cap R=\emptyset$;
\item
\defn{symmetric}, if $R\inv=R$;
\item
\defn{anti-symmetric}, if $R\cap R\inv \included \Delta_A$;
\item
\defn{transitive}, if $R\circ R\included R$.
\end{itemize}
Then $R$ is:
\begin{itemize}
\item
an \defn{equivalence-relation}, if it is reflexive, symmetric and
transitive; 
  \item
a \defn{partial ordering}, if it is anti-symmetric and transitive and
either reflexive or irreflexive;
\item
a \defn{strict} partial ordering, if it is an irreflexive partial
ordering;
\item
a \defn{total ordering}, if it is a partial ordering and
$R\cup\Delta_A\cup R\inv=A\times A$. 
\end{itemize}
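These relation-algebraic definitions are directly checkable on a finite set. The sketch below (helper names \texttt{diagonal}, \texttt{inverse}, \texttt{compose} are ours) verifies that the usual ordering $\leq$ on $\{0,1,2\}$ is a reflexive total ordering:

```python
# The diagonal, inverse, and composition of relations on a small set.
from itertools import product

def diagonal(A):
    return {(x, x) for x in A}

def inverse(R):
    return {(y, x) for (x, y) in R}

def compose(S, R):
    # S ∘ R = {(x,z) : ∃y ((x,y) ∈ R and (y,z) ∈ S)}
    return {(x, z) for (x, y1) in R for (y2, z) in S if y1 == y2}

A = {0, 1, 2}
leq = {(x, y) for x, y in product(A, A) if x <= y}

# ≤ is reflexive, anti-symmetric and transitive (a partial ordering),
# and total, since R ∪ Δ ∪ R⁻¹ = A × A.
assert diagonal(A) <= leq                                      # reflexive
assert leq & inverse(leq) <= diagonal(A)                       # anti-symmetric
assert compose(leq, leq) <= leq                                # transitive
assert leq | diagonal(A) | inverse(leq) == set(product(A, A))  # total
```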
A relation $f$ between $A$ and $B$ is a \defn{function}, or
\defn{map}, from $A$ to $B$ if
\begin{equation*}
  f\circ f\inv\included\Delta_B\land \Delta_A\included f\inv\circ f. 
\end{equation*}
Suppose $f$ is thus.  We may refer to the function $f:A\to B$.  For
each $x$ in $A$, there is a unique element $f(x)$ of $B$ such that
$(x,f(x))\in f$.  Here $f(x)$ is the \defn{value} of $f$ at $x$.  We
may refer to $f$ as 
\begin{equation*}
  x\longmapsto f(x):A\To B.
\end{equation*}
The \defn{domain} of $f$ is $A$, and $f$ is a function \defn{on} $A$.
The \defn{range} of $f$ is the set $\{y\in B:\Exists x(x,y)\in f\}$,
that is, $\{f(x):x\in A\}$, which is denoted 
\begin{equation*}\label{eqn:setim}
  f\setim A.
\end{equation*}
If
$C\included A$, then $f\cap(C\times B)$ is a function on $C$, denoted
$f\rest C$ and having range $f\setim C$.  This set is also the
\defn{image} of $C$ under $f$.

The function $f:A\to B$ is:
\begin{itemize}
  \item
\defn{surjective} or \defn{onto}, if $\Delta_B\included f\circ f\inv$;
\item
\defn{injective} or \defn{one-to-one}, if $f\inv\circ f\included\Delta_A$;
\item
\defn{bijective}, if surjective and injective (that is, one-to-one and
onto). 
\end{itemize}
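The same relation algebra decides whether a relation is a function, and whether it is injective or surjective. A Python sketch (again with our own helper names):

```python
# Relation-algebraic tests for function, injective, surjective.
def inverse(R):
    return {(y, x) for (x, y) in R}

def compose(S, R):
    return {(x, z) for (x, y1) in R for (y2, z) in S if y1 == y2}

def diagonal(A):
    return {(x, x) for x in A}

A, B = {0, 1, 2}, {'a', 'b', 'c'}
f = {(0, 'a'), (1, 'b'), (2, 'c')}

# f ∘ f⁻¹ ⊆ Δ_B (single-valued) and Δ_A ⊆ f⁻¹ ∘ f (total): f is a function.
assert compose(f, inverse(f)) <= diagonal(B)
assert diagonal(A) <= compose(inverse(f), f)

# Injective: f⁻¹ ∘ f ⊆ Δ_A; surjective: Δ_B ⊆ f ∘ f⁻¹.
assert compose(inverse(f), f) <= diagonal(A)
assert diagonal(B) <= compose(f, inverse(f))
```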

All of the foregoing definitions involving relations make sense even
if $A$ and $B$ are merely classes.  

  To discuss functions in the most
general sense, it is convenient to introduce a new quantifier,
\begin{equation*}
  \existsunique,
\end{equation*}
read `there exists a unique\dots such that'; this
quantifier is defined by
\begin{equation*}
  \Existsunique x\phi(x)\Iff \Exists x(\phi(x)\land\Forall y(\phi(y)\to
  y=x)). 
  \Existsunique x\phi(x)\Iff \Exists x(\phi(x)\land\Forall y(\phi(y)\to
\end{equation*}
A formula $\psi$ with at most the free variables $x$ and $y$ can be
written
\begin{equation*}
  \psi(x,y);
\end{equation*}
it is a \defn{binary} formula.
Then a function is a class
   $\{(x,y):\psi(x,y)\}$
such that
\begin{equation*}
  \Forall x(\Exists y\psi(x,y)\to\Existsunique y\psi(x,y)). 
\end{equation*}
The domain of this function is $\{x:\Exists y\psi(x,y)\}$.  If the
function itself is called $f$, and if its domain includes a set $A$, then
the image $f\setim A$ or $\{f(x):x\in A\}$ is the class
\begin{equation*}
  \{y:\Exists x(x\in A\land \psi(x,y))\}.
\end{equation*}
That this class is a \emph{set} is the content of the following:

\begin{axiom}[Replacement]\label{ax:replacement}
The image of a set under a function is a set:  For all classes
 $\{(x,y):\psi(x,y)\}$ that are \emph{functions},
 \begin{equation*}
   \Forall x\Exists y\Forall z\Forall w(z\in x\land\psi(z,w)\to w\in
   y). 
 \end{equation*}
\end{axiom}

Like Comprehension, the Replacement Axiom is a scheme of
$\in$-sentences.  Indeed, for each binary formula $\psi(x,y)$, we have
\begin{equation*}
    \Forall x(\Exists y\psi(x,y)\to\Existsunique y\psi(x,y))\to
\Forall x\Exists y\Forall z\Forall w(z\in x\land\psi(z,w)\to w\in
   y).
\end{equation*}
If we have a function $x\mapsto B_x$ on a set $A$, then the
union \eqref{eqn:indexed-union} above is now well-defined.


Other set-theoretic axioms will arise in the course of the ensuing
discussion.  

\section{Model-theory}
\markright{\sectbegin Model-theory}

A \defn{unary} relation on a set is just a subset.
A unary \defn{operation} on a set is a function from the set to
itself.  A \defn{binary} operation on a set $A$ is a function from
$A\times A$ to $A$.  We can continue.  A \tech{ternary} relation on
$A$ is a subset of
\begin{equation*}
  A\times A\times A,
\end{equation*}
that is, $(A\times A)\times A$, also denoted $A^3$.  A ternary
operation on $A$ is a function from $A^3$ to $A$.  More generally:

\begin{definition}
  The \defn{cartesian powers} of a set $A$ are defined recursively:
  \begin{enumerate}
    \item
$A$ is a cartesian power of $A$.
\item
If $B$ is a cartesian power of $A$, then so is $B\times A$.
  \end{enumerate}
A \defn{relation} on $A$ is a subset of a cartesian power of $A$.  An
\defn{operation} on $A$ is a function from a cartesian power of $A$
into $A$.
\end{definition}
Note that we do not (yet) assert the existence of a \emph{set}
containing the cartesian powers of $A$.
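The recursive definition of cartesian powers can be sketched in Python as follows (the name \texttt{cartesian\_powers} is ours; tuples stand in for the nested pairs, so $(A\times A)\times A$ is flattened to triples):

```python
# The cartesian powers of A, following the recursion:
# A is a power; if B is a power, so is B×A.
from itertools import product

def cartesian_powers(A, n):
    """Yield the first n cartesian powers of A: A, A×A, (A×A)×A, ..."""
    B = {(a,) for a in A}                  # represent A itself as 1-tuples
    for _ in range(n):
        yield B
        B = {b + (a,) for b in B for a in A}   # B×A, flattened to tuples

A = {0, 1}
p1, p2, p3 = cartesian_powers(A, 3)
assert p2 == set(product(A, A))
assert len(p3) == len(A) ** 3
```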

\begin{exercise}
  Is there a \emph{class} containing the cartesian powers of a given
  set and nothing else? 
\end{exercise}

\begin{definition}\label{defn:arb-formula}
  A \defn{structure} is an ordered pair
  \begin{equation*}
    (A,T),
  \end{equation*}
where $A$ is a non-empty set, and $T$ is a set (possibly empty) whose
elements are
operations and relations on $A$ and elements of $A$.  The set $A$ is
the \defn{universe} of the structure.  The structure itself can then
be denoted
\begin{equation*}
  \str A
\end{equation*}
(or just $A$ again).  A \defn{signature} of $\str A$ is a set $S$ of
symbols for the elements of $T$.  This means:
\begin{enumerate}
  \item
There is a bijection $s\mapsto s^{\str A}:S\to T$.
\item
Different structures can have the same signature.
\end{enumerate}
The element $s^{\str A}$ of $T$ is the \defn{interpretation} in $\str
A$ of the symbol $s$.  Usually one doesn't bother to write the
superscript for an interpretation, so $s$ might really mean $s^{\str
  A}$. 
\end{definition}


In the next section, we shall assert as an axiom---the Peano
Axiom---the existence of a structure 
\begin{equation*}
  (\N,\{{}\scr{},0\})
\end{equation*}
having certain properties.  The universe $\N$ will be the set of
\tech{natural numbers}, and $\scr{}$ will be the unary operation
$x\mapsto x+1$.  The structure is more conveniently written as
\begin{equation*}
  (\N,{}\scr{},0);
\end{equation*}
we shall also look at structures $(\N,{}\scr{},0,P)$, where $P$ is a
unary relation on $\N$.



An $\in$-sentence is supposed to be a statement about the world
of (hereditary) sets.  Structures live in this world.  The signature
of a structure allows us to write sentences that are true or false
\emph{in the structure}.  The Peano Axiom will be that certain
sentences of the signatures $\{{}\scr{},0\}$ and $\{{}\scr{},0,P\}$
are \tech{true in}  $(\N,{}\scr{},0)$ and  $(\N,{}\scr{},0,P)$.

\begin{definition}
  The \defn{terms} of  $\{{}\scr{},0\}$ and  $\{{}\scr{},0,P\}$ are
  defined recursively:
\begin{enumerate}
  \item
Variables and names and $0$ are terms.
\item
If $t$ is a term, then so is $\scr t$.
\end{enumerate}
The \defn{atomic} formulas of $\{{}\scr{},0\}$ are equations $t=u$ of
terms; the signature $\{{}\scr{},0,P\}$ also has atomic formulas
\begin{equation*}
  P(t),
\end{equation*}
where $t$ is a term.
From the atomic formulas, formulas are built up as in Definition
\ref{defn:formula}. 
\end{definition}

The definition can be generalized to other signatures.  If for example
the signature has a binary operation-symbol $+$, and $t$ and $u$ are
terms of the signature, then so is $(t+u)$.

\begin{definition}\label{defn:truth}
  An atomic \emph{sentence} $\sigma$ becomes \defn{true} or
  \defn{false in} a
structure $\str A$, once interpretations $c^{\str A}$ are chosen for
any names $c$ appearing in $\sigma$; if $\sigma$ is true in $\str A$,
then we write
\begin{equation}\label{eqn:models}
  \str A\models\sigma,
\end{equation}
and we say that $\str A$ is a \defn{model} of $\sigma$.  Note that
\eqref{eqn:models} could be written out as an $\in$-sentence.  For
arbitrary sentences, we define:
\begin{align*}
  \str A\models\lnot\sigma&\Iff \lnot(\str A\models \sigma),\\
\str A\models\sigma\Bcon\tau&\Iff \str A\models\sigma\Bcon \str
A\models\tau, 
\end{align*}
where $\Bcon$ is $\land$, $\lor$, $\to$ or $\iff$.  Finally,
\begin{equation}\label{eqn:models-forall}
  \str A\models\Forall x\phi(x)
\end{equation}
if and only if $\str A\models\phi(a)$ for all $a$ in $A$; and
\begin{equation*}
  \str A\models\Forall x\phi(x)\Iff \str A\models\lnot\Exists x\lnot
  \phi(x). 
\end{equation*}
\end{definition}

\begin{exercise}\label{exer:var-name}
  In the definition of \eqref{eqn:models-forall}, is $a$ a variable or
  a name?
\end{exercise}


\setcounter{equation}{0}\section{The Peano axioms}\label{sect:Peano}
\markright{\sectbegin The Peano axioms}

The five so-called Peano axioms amount to the following five-part
assertion: 

\begin{axiom}[Peano]
There is a set $\N$,
\begin{enumerate}
\item
containing a distinguished element $0$ (called \defn{zero}), and
  \item
equipped with a unary  operation 
$x\mapsto \scr x$ (the \defn{successor-operation}), such that
\item
%\textnormal{\textbf{(\axz)}}
  $(\N,{}\scr{},0)\models\Forall x\scr x\neq0$;
  \item
%\textnormal{\textbf{(\axu)}}
  $(\N,{}\scr{})\models\Forall x\Forall y(\scr x=\scr y\to x=y)$;
  \item
%\textnormal{\textbf{(\axi)}}
  $(\N,{}\scr{},0,P)\models P(0)\land \Forall x(P(x)\to P(\scr x))\to
  \Forall x P(x)$, for every unary relation $P$ on $\N$. 
\end{enumerate}
\end{axiom}

Thus, in one sense, there is a single `Peano axiom', asserting that a
structure $(\N,{}\scr{},0)$ exists with certain properties.
Its properties are that it satisfies the following three
\tech{axioms}---where now `axiom' is used in a slightly different sense: 
\begin{description}
  \item[\axz]
  $\forall x\qsep \scr x\neq0$;
  \item[\axu]
  $\forall x\qsep\forall y\qsep(\scr x=\scr y\to x=y)$;
  \item[\axi]
  $0\in X\land \forall x\qsep(x\in X\to \scr x\in
  X)\to \forall x\qsep x\in X$, for every subset $X$.
\end{description}
The set-theoretic axioms given in \S~\ref{sect:sets} are supposed to be
true in the mathematical world.  The three axioms just above are
supposed to be true \emph{in a particular structure} in the
mathematical world.  Note that \axi{}, considered as a single
sentence, is not a first-order
sentence, but is \tech{second-order}, since the variable $X$ refers to
sub\emph{sets} of a model, and not to elements.  (\axz{} and \axu{}
are first-order.)

\begin{remark}
In first-order logic, \axi{} is replaced by a \tech{scheme} of axioms,
consisting of one sentence
\begin{equation}\label{eqn:pa}
\phi(0)\land \Forall x(\phi(x)\to \phi(\scr x))\to
  \Forall x \phi(x)  
\end{equation}
for each unary first-order formula $\phi$ in the signature
$\{{}\scr{},0\}$ with parameters.  This scheme of axioms is
\emph{weaker} than \axi, because not every subset of $\N$ is defined
by a first-order formula.  (Later we shall be able to prove this:
There are \tech{countably} many formulas $\phi(x)$, but
$\N$ has \tech{uncountably} many subsets.)  This scheme of axioms
\eqref{eqn:pa}, together with \axz{} and
\axu, might be denoted $\pa$.  It is a consequence of G\"odel's
Incompleteness Theorem that $\pa$ is an \tech{incomplete} theory.
This means that some first-order sentences are true in
$(\N,{}\scr{},0)$, but are not logical consequences of $\pa$.  In
fact, there are models of $\pa$ that are not models of \axi.
\end{remark}

To talk more about the Peano Axioms, we make the following:

\begin{definition}
A natural number is called a \defn{successor} if it is $\scr x$ for
some $x$ in $\N$.  We have special names for certain successors:
\begin{center}
  \begin{tabular}{c||c|c|c|c|c|c|c|c|c}
    $x$ & 0&1&2&3&4&5&6&7&8\\ \hline
$\scr x$ &1&2&3&4&5&6&7&8&9
  \end{tabular}
\end{center}
A natural number $x$ is an \defn{immediate predecessor of} $y$ if
$\scr x=y$.  
\end{definition}

Later we shall define the binary operation $(x,y)\mapsto x+y$ so that
$\scr x=x+1$.

Our names for the Peano Axioms are tied to their meanings (although
these names are not in general use):
\begin{itemize}
  \item
  \axz\ is that \emph{Z}ero is not a successor.
\item
\axu\ is that immediate predecessors are \emph{U}nique when
they exist.
\item
\axi\ is the Axiom of \defn{Induction}:\ a set contains all natural
numbers, provided that it contains $0$ and contains the successor of
each natural number that it contains.
\end{itemize}

Also, \axz\ is that the immediate
predecessor of $0$ does \emph{not} exist.\footnote{Peano did not count
  $0$ as a natural number, so 
  his original axioms included the assertion that $1$ had no immediate
  predecessor.}  \axu{} is that the successor-operation is injective.

We may henceforth write $\N$ instead of $(\N,{}\scr{},0)$.
  As first examples of the Induction Axiom in action, we have:

\begin{lemma}\label{lem:zero-succ}
Every non-zero natural number is a successor.  Symbolically,
\begin{equation*}\N\models\Forall x (x=0\lor\Exists y \scr
  y=x).\end{equation*} 
\end{lemma}

\begin{proof}
Let $A$ be the set of natural numbers comprising $0$ and the
successors.  That is, $A=\{0\}\cup\{x\in\N:\Exists y\scr y=x\}$.
Then $0\in A$ by definition.  Also, if $x\in A$, then 
$\scr x$ is a successor, so $\scr x\in A$.  By induction, $A=\N$.
\end{proof}

\begin{theorem}
  The successor-operation is a bijection between $\N$ and
  $\N\setminus\{0\}$. 
\end{theorem}

\begin{exercise}
  Prove the theorem.
\end{exercise}

\begin{lemma}Every natural number is distinct from its successor:
\begin{equation*}\N\models\Forall x \scr x\neq x.\end{equation*}
\end{lemma}

\begin{proof}
Let $A=\{x\in\N:\scr x\neq x\}$.  Now, $\scr 0$ is a successor and is
therefore distinct from $0$ by \axz.  Hence $0\in A$.  Suppose
$x\in A$.  Then $\scr x\neq x$.  Therefore $\scr{(\scr x)}\neq \scr x$ by the
contrapositive of \axu; so $\scr x\in A$.  By induction, $A=\N$.
\end{proof}

We can spell out \axi{} more elaborately thus: \emph{For every unary
  relation 
  $P$ on $\N$, in order to prove $\N\models\Forall x
  P(x)$, it is enough to prove two things:
  \begin{enumerate}
    \item
$\N\models P(0)$ (the \defn{base step});
\item
$\N\models\Forall x
  (P(x)\to P(\scr x))$ (the \defn{inductive step}), that is,
  $P(\scr x)$ is true under the assumption that $x$ is a natural number
  and $P(x)$ is true.
  \end{enumerate}
}
In the inductive step of a proof, the assumption that $x\in\N$ and
$\N\models P(x)$ is called the \defn{inductive hypothesis}.  In the
proof of Lemma \ref{lem:zero-succ}, the full inductive hypothesis was
not needed; only $x\in\N$ was needed.

\setcounter{equation}{0}\section{Binary operations on natural numbers}
\markright{\sectbegin Binary operations on natural numbers}

To be able to say much more about the natural numbers, we should
introduce the usual arithmetic operations.  But how?  We do not need new
axioms; the axioms that we already have are enough to enable us to
\emph{define} the arithmetic operations.

Let's start with \defn{addition}.  This is a binary operation $+$ on
$\N$ whose values can be arranged in an (infinite) matrix as follows,
in which $m+n$ is the entry $(m,n)$, that is, the entry in row $m$
and column $n$, the counts starting at $0$:
\begin{equation*}
  \begin{matrix}
    0 & 1 & 2 & 3 & \cdots\\
1 & 2 & 3 & 4 & \\
 2 & 3 & 4 & 5 &\\
3 & 4 & 5 & 6 &\\
\vdots &&&&\ddots
  \end{matrix}
\end{equation*}
Then row $m$ of this matrix is the sequence of values of a unary
operation $f_m$ on $\N$ such that $f_m(0)=m$ and $f_m(\scr
n)=\scr{f_m(n)}$ for all $n$ in $\N$.  So we can \emph{define} $m+n$
as $f_m(n)$.  To do this rigorously, we need to know two facts:
\begin{enumerate}
  \item
that the functions $f_m$ exist (so that an addition can be defined);
and
\item
that the $f_m$ are unique (so that there is only one addition).
\end{enumerate}
Each of these facts is established by induction, as follows:
\begin{theorem}\label{thm:addition}
  There is a unique binary operation $+$ on $\N$ such
  that $x+0=x$ and
  \begin{equation*}
    x+\scr y=\scr{(x+y)}
  \end{equation*}
for all $x$ and $y$ in $\N$.
\end{theorem}

\begin{proof}
  Let $A$ be the set of natural numbers $x$ for which there is a unary
  operation $f_x$ on $\N$ such that $f_x(0)=x$ and
  \begin{equation*}
    f_x(\scr y)=\scr{f_x(y)}
  \end{equation*}
for all $y$ in $\N$.  We can define $f_0$ by
\begin{equation*}
  f_0(y)=y.
\end{equation*}
So $0\in A$.  Suppose $x\in A$.  Define $f_{\scr x}$ by
\begin{equation*}
  f_{\scr x}(y)=\scr{f_x(y)}.
\end{equation*}
Then $f_{\scr x}(0)=\scr{f_x(0)}=\scr x$, and
\begin{equation*}
  f_{\scr x}(\scr y)=\scr{f_x(\scr y)}=\scr{(\scr{f_x(y)})}=\scr{f_{\scr
  x}(y)}; 
\end{equation*}
so $\scr x\in A$.  By induction, $A=\N$.  This establishes the
\emph{existence} of the desired operation $+$, since
we can define $x+y=f_x(y)$.

For the uniqueness of $+$, it is enough to note the uniqueness of the
functions $f_x$.  If $f'_x$ has the properties of $f_x$, then
$f'_x(0)=x=f_x(0)$, and if $f'_x(y)=f_x(y)$, then $f'_x(\scr
y)=\scr{f'_x(y)}= \scr{f_x(y)}=f_x(\scr y)$.  By induction,
$f'_x=f_x$. 
\end{proof}
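The construction in the proof can be mirrored computationally. In the Python sketch below (the names \texttt{succ}, \texttt{f}, and \texttt{add} are ours), \texttt{succ} plays the role of the successor-operation, and each $f_x$ is built by recursion in its argument:

```python
# The proof's construction: unary functions f_x with
# f_x(0) = x and f_x(succ y) = succ(f_x(y)).
def succ(n):
    return n + 1

def f(x):
    """Return the unary operation f_x, defined recursively."""
    def fx(y):
        if y == 0:
            return x               # f_x(0) = x
        return succ(fx(y - 1))     # f_x(succ y) = succ(f_x(y))
    return fx

def add(x, y):                     # define x + y as f_x(y)
    return f(x)(y)

assert add(3, 0) == 3                # x + 0 = x
assert add(2, 5) == succ(add(2, 4))  # x + succ y = succ(x + y)
assert all(add(m, n) == m + n for m in range(6) for n in range(6))
```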

\begin{lemma}\label{lem:add}
  $\N$ satisfies
\begin{enumerate}
  \item
  $\Forall x 0+x=x$,
  \item\label{succ-add}
  $\Forall x \Forall y \scr y+x=\scr{(y+x)}$.
\end{enumerate}
\end{lemma}

\begin{exercise}
 Prove the lemma.  (For part (\ref{succ-add}), this can be
  done by showing $\N=\{x:\Forall y
\scr y+x=\scr{(y+x)}\}$.)
\end{exercise}

\begin{theorem}
$\N$ satisfies
\begin{enumerate}\setcounter{enumi}{2}
  \item
  $\Forall x \scr x=x+1$,
  \item
  $\Forall x \Forall y x+y=y+x$ [that is, $+$ is \defn{commutative}],
  \item
  $\Forall x \Forall y \Forall z (x+y)+z=x+(y+z)$ [that
  is, $+$ is   \defn{associative}].
\end{enumerate}
\end{theorem}

\begin{exercise}
  Prove the theorem.
\end{exercise}

We can uniquely define \defn{multiplication} on $\N$ just as we did
addition:  We can show that the multiplication-table
\begin{equation*}
  \begin{matrix}
    0 & 0 & 0 & 0 & \cdots\\
0 & 1 & 2 & 3 & \\
0 & 2 & 4 & 6 & \\
0 & 3 & 6 & 9 & \\
\vdots &&&&\ddots
  \end{matrix}
\end{equation*}
can be written in exactly one way:

\begin{theorem}\label{thm:multiplication}
  There is a unique binary operation $\cdot$ on $\N$ such that $x\cdot
  0=0$ and 
  \begin{equation*}
    x\cdot\scr y=x\cdot y+x
  \end{equation*}
for all $x$ and $y$ in $\N$.
\end{theorem}

\begin{exercise}
  Prove the theorem.
\end{exercise}
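Multiplication follows the same recursive pattern as addition; a brief Python sketch (the name \texttt{mul} is ours, and the built-in \texttt{+} stands in for the addition already defined):

```python
# Multiplication by recursion: x·0 = 0 and x·succ(y) = x·y + x.
def mul(x, y):
    if y == 0:
        return 0                   # x·0 = 0
    return mul(x, y - 1) + x       # x·succ(y) = x·y + x

assert all(mul(m, n) == m * n for m in range(7) for n in range(7))
```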

Multiplication is also indicated by juxtaposition, so that $x\cdot y$
is $xy$. 

\begin{lemma}
$\N$ satisfies
\begin{enumerate}
  \item
  $\Forall x 0x=0$,
  \item
  $\Forall x \Forall y \scr yx=yx+x$.
\end{enumerate}
\end{lemma}

\begin{exercise}
  Prove the lemma.
\end{exercise}

\begin{theorem}
$\N$ satisfies
\begin{enumerate}\setcounter{enumi}{2}
  \item
  $\Forall x 1x=x$,
  \item
  $\Forall x \Forall y xy=yx$ [that is, $\cdot$ is commutative],
  \item
  $\Forall x \Forall y \Forall z (x+y)z=xz+yz$ [that
  is, $\cdot$ \defn{distributes} over $+$],
  \item
  $\Forall x \Forall y \Forall z (xy)z=x(yz)$ [that is,
  $\cdot$ is associative].
\end{enumerate}
\end{theorem}

\begin{exercise}
  Prove the theorem.
\end{exercise}

In establishing addition and multiplication as operations with the
familiar properties, we used only that $\N$ satisfies the
Induction Axiom.  Other structures satisfy this axiom as well, so they
too have addition and multiplication:

\begin{example}\label{example:3}
  Let $A=\{0,1,2\}$, and define $s:A\to A$ by
  \begin{center}
    \begin{tabular}{c || c | c | c}
$x$ & $0$ & $1$ & $2$\\ \hline
$s(x)$ &  $1$ & $2$ & $0$      
    \end{tabular}
\end{center}
Then $(A,s,0)$ satisfies \axi, so it must have addition and
multiplication---which in fact are given by the matrices
\begin{equation*}
  \begin{matrix}
    0 & 1 & 2\\
1 & 2 & 0\\
2 & 0 & 1
  \end{matrix}\quad\text{ and }\quad
  \begin{matrix}
    0 & 0 & 0\\
0 & 1 & 2\\
0 & 2 & 1
  \end{matrix}
\end{equation*}
But $(A,s,0)$ does not satisfy \axz.
\end{example}
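That the two matrices of the Example really do satisfy the defining identities of addition and multiplication can be checked directly; a Python sketch (variable names ours):

```python
# In A = {0,1,2} with s the cyclic successor, the given matrices
# satisfy x+0 = x, x+s(y) = s(x+y), x·0 = 0, and x·s(y) = x·y + x.
A = [0, 1, 2]
s = {0: 1, 1: 2, 2: 0}
add = [[0, 1, 2], [1, 2, 0], [2, 0, 1]]   # the first matrix
mul = [[0, 0, 0], [0, 1, 2], [0, 2, 1]]   # the second matrix

for x in A:
    assert add[x][0] == x                      # x + 0 = x
    assert mul[x][0] == 0                      # x · 0 = 0
    for y in A:
        assert add[x][s[y]] == s[add[x][y]]            # x + s(y) = s(x+y)
        assert mul[x][s[y]] == add[mul[x][y]][x]       # x · s(y) = x·y + x

# But Axiom Z fails: 0 is a successor, since s(2) = 0.
assert s[2] == 0
```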

If a structure satisfies \axi, we may say that the structure
\defn{admits (proof by) induction}.\label{page:induction}
So all structures that admit induction have unique operations of addition
and multiplication with the properties given above.

Exponentiation on $\N$ is a binary operation $(x,y)\mapsto x^y$ whose
values compose the matrix
\begin{equation*}
  \begin{matrix}
    1 & 0 & 0 & 0& \cdots\\
1 & 1 & 1 & 1 & \\
1 & 2 & 4 & 8 & \\
1 & 3 & 9 & 27 &\\
\vdots &&&&\ddots
  \end{matrix}
\end{equation*}
The formal properties are that $x^0=1$ and 
\begin{equation*}
  x^{\scr y}=x^y\cdot x
\end{equation*}
for all $x$ and $y$ in $\N$. By induction, there can be no more than
one such operation:

\begin{exercise}
  Prove this.
\end{exercise}

Nonetheless, we shall need more than induction to prove that such an
operation exists at all:

\begin{example}
  In the induction-admitting structure $(A,s,0)$ of Example \ref{example:3}, if
  we try to 
  define exponentiation, we get $2^0=1$, $2^1=2$, $2^2=1$,
  $2^{s(2)}=2^2\cdot 2=2$; but $s(2)=0$, so
  $2^{s(2)}=2^0=1$.  Since $1\neq 2$, our attempt fails.
\end{example}
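The failure in the Example can also be confirmed by brute force: among all $27$ maps from $A$ to $A$, none satisfies the defining identities of $y\mapsto 2^y$. A Python sketch (names ours):

```python
# Brute-force check: in ({0,1,2}, s, 0) with mod-3 multiplication, there is
# no map g with g(0) = 1 and g(s(y)) = g(y)·2; so y ↦ 2^y is undefinable.
from itertools import product

s = {0: 1, 1: 2, 2: 0}

def mul(x, y):
    return (x * y) % 3            # the Example's multiplication

candidates = [
    dict(zip([0, 1, 2], values))
    for values in product([0, 1, 2], repeat=3)
]
good = [
    g for g in candidates
    if g[0] == 1 and all(g[s[y]] == mul(g[y], 2) for y in [0, 1, 2])
]
assert good == []                 # no such g exists: the attempt must fail
```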




For any $x$ in $\N$, we want to define $y\mapsto x^y$ as an operation
$g$ such that $g(0)=1$, and $g(\scr n)=g(n)\cdot x$.  We have just
seen that induction is \emph{not} enough to allow us to do this.  In
the next section, we shall see that \tech{recursion} is enough, and
that this is equivalent to \axz, \axu{} and \axi{} together.

\section{Recursion}
\markright{\sectbegin Recursion}

We want to be able to define functions $g$ on $\N$ by specifying
$g(0)$ and by specifying how to obtain $g(\scr n)$ from $g(n)$.  The
next theorem is that we can do this.  The proof is difficult, but the
result is powerful:

\begin{theorem}[Recursion]\label{thm:recursion}
Suppose $B$ is a set with an element $c$.  Suppose $f$ is a unary operation
on $B$.  Then there is a \emph{unique} function
$g:\N\to B$ such that $g(0)=c$ and 
\begin{equation}\label{eqn:recursion}
g(\scr x)=f(g(x))
\end{equation}
for all $x$ in $\N$.
\end{theorem}

\begin{proof}
Let $\family S$ be the set whose members
are the subsets $R$ of $\N\times B$ that have the following two
properties:
\begin{enumerate}\setcounter{enumi}{1}
  \item\label{item:base}
$(0,c)\in R$;
\item\label{item:ind}
$(x,t)\in R\implies(\scr x,f(t))\in R$, for all $(x,t)$ in $\N\times B$.  
\end{enumerate}
So the members of $\family S$ have the properties required of $g$,
except perhaps the property of being a function on $\N$.

The set $\family S$ is non-empty,
since $\N\times B$ itself is in $\family S$.  Let $g$ be the
intersection $\bigcap\family S$.  Then $g\in\family S$ (why?).

We shall show that $g$ is a function with domain
$\N$.  To do this, we shall show by induction that, for all $x$ in
$\N$, there is a unique $t$ in $B$ such that $(x,t)\in g$.

For the base step of our induction, we note first that $(0,c)\in g$.
To finish the base step, we shall show that, for every $t$ in $B$, if
$(0,t)\in g$, then $t=c$.  Suppose $t\neq c$.  Then neither property
(\ref{item:base}) nor property (\ref{item:ind}) requires $(0,t)$ to be
in a given member of $\family S$.  That is, if $R\in\family S$, then
$R\setminus\{(0,t)\}$ still has these two properties; so, this
set is in $\family S$.  In particular,
$g\setminus\{(0,t)\}\in\family S$. 
But $g$ is the smallest member of $\family S$, so 
\begin{equation*}g\included g\setminus\{(0,t)\},\end{equation*} 
which means $(0,t)\notin g$.  By contraposition, the base step is
complete. 

As an inductive hypothesis, let us suppose that $x\in \N$ and that
there is a unique $t$ in $B$ such that
$(x,t)\in g$.  Then $(\scr x,f(t))\in g$.  To complete our inductive
step, we shall show that, for every $u$ in $B$, if $(\scr
x,u)\in g$, then $u=f(t)$. 
There are two possibilities for $u$:
\begin{enumerate}\setcounter{enumi}{3}
  \item
If $(\scr x,u)=(\scr y,f(v))$
for some $(y,v)$ in $g$, then $\scr x=\scr y$, so $x=y$ by \axu; this
means $(x,v)\in g$, so $v=t$ by inductive hypothesis, and therefore
$u=f(v)=f(t)$. 
\item
If $(\scr x,u)\neq(\scr y,f(v))$
for any $(y,v)$ in $g$, then (as in the base step) $g\setminus\{(\scr
x,u)\}\in\family S$, so $g\included g\setminus\{(\scr x,u)\}$, which
means $(\scr x,u)\notin g$. 
\end{enumerate}
Therefore, if $(\scr x,u)\in g$, then $(\scr x,u)=(\scr y,f(v))$
for some $(y,v)$ in $g$, in which case $u=f(t)$.  Therefore
$f(t)$ is unique such that $(\scr x,f(t))\in g$.

Our induction is now complete; by \axi, we may conclude that
$g$ is a function on $\N$ with the
required properties (\ref{item:base}) and (\ref{item:ind}).  If $h$
is also such a function, then $h\in\family S$, so
$g\included h$, which means $g=h$ since both are functions on
$\N$.  So $g$ is unique. 
\end{proof}

\begin{exercise}
If $g$ and $\family S$ are as in the proof of the Recursion Theorem,
prove that $g\in\family S$.
\end{exercise}

Equation
(\ref{eqn:recursion}) in the statement of Theorem \ref{thm:recursion}
is depicted in the following diagram:  
\begin{equation*}
\begin{CD}
\N @>{\scr{}}>> \N\\
@V{g}VV @VV{g}V\\
B @>>{f}> B
\end{CD}
\end{equation*}
From the $\N$ on the left to the $B$ on the right, there are two
different routes, but each one yields the
same result.

A \defn{definition by recursion}\label{page:recursion} is a definition
of a function on 
$\N$ that is justified by Theorem \ref{thm:recursion}.  Informally,
we can define such a function $g$ by specifying $g(0)$ and by
specifying how $g(\scr x)$ is obtained from $g(x)$.
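A definition by recursion can be imitated in Python (the helper name \texttt{recurse} is ours; the loop unwinds the recursion $g(\scr x)=f(g(x))$ from the base $g(0)=c$):

```python
# Definition by recursion: given c in B and a unary operation f on B,
# produce the unique g with g(0) = c and g(succ x) = f(g(x)).
def recurse(c, f):
    def g(n):
        value = c                 # g(0) = c
        for _ in range(n):
            value = f(value)      # g(succ x) = f(g(x))
        return value
    return g

# Example: taking c = 1 and f = doubling yields n ↦ 2^n.
g = recurse(1, lambda t: 2 * t)
assert [g(n) for n in range(5)] == [1, 2, 4, 8, 16]
```

This is exactly the pattern that fails in Example \ref{example:3}, where the successor is not well-founded.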

\begin{remark}
Sections \ref{sect:arith-ops} and \ref{sect:recursion-gen} will
provide several important examples of recursive definitions.  
\end{remark}

\begin{theorem}
  The Induction Axiom is a logical consequence of the Recursion
  Theorem. 
\end{theorem}

\begin{proof}
  Suppose $A\included\N$, and $0\in A$, and $\scr x\in A$ whenever
  $x\in A$.  Using the Recursion Theorem alone, we shall show $A=\N$. 

Let $\B=\{0,1\}$, and define a function $g_0:\N\to\B$ by the rule
\begin{equation*}g_0(x)=
\begin{cases}
  0,&\text{ if }x\in A;\\
1,&\text{ if }x\in \N\setminus A.
\end{cases}
\end{equation*}
Then $g_0$ is a function $g:\N\to\B$ such that $g(0)=0$ and $g(\scr
x)=g(x)$ for all $x$ in $\N$ (why?).  But the function $g_1$ such that
$g_1(x)=0$ for all $x$ in $\N$ is also such a function $g$.  By the
Recursion Theorem, there is only one such function $g$.  Therefore
$g_0=g_1$, so $g_0(x)$ is never $1$, which means $A=\N$.
\end{proof}

\begin{exercise}
  Supply the missing detail in the proof.
\end{exercise}

However, there are models of the Induction Axiom which do not satisfy
the Recursion Theorem:

\begin{example}\label{exam:ind-not-imp-rec}
  Again let $\B=\{0,1\}$, and let $\lnot$ be the unary operation on
 $\B$ such that $\lnot 0=1$ and $\lnot 1=0$.  Then
  $(\B,\lnot,0)$ admits induction, but there is \emph{no}
  function $g:\B\to\N$ such that $g(0)=0$ and $g(\lnot x)={(g(x))}+1$
  for all $x$ in $\B$.
\end{example}

\begin{remark}
Apparently Peano 
  himself did not recognize the distinction between proof by induction
  and definition by recursion; see the discussion in Landau
  \cite[p.~x]{MR12:397m}. 
  Burris \cite[p.~391]{Burris} does not acknowledge the distinction.
  Stoll \cite[p.~72]{MR83e:04002} uses the term `definition by weak
  recursion', although he
  observes that the validity of such a definition does \emph{not
  obviously} follow from the Induction Axiom.  However, Stoll does not
  \emph{prove} (as we have done in Example \ref{exam:ind-not-imp-rec})
  that the Induction Axiom is consistent with the negation of the
  Recursion Theorem.
\end{remark}

\begin{remark}
  The structure $(\B,\lnot,0)$ in Example \ref{exam:ind-not-imp-rec} also
  satisfies \axu, but not \axi.  If we define $t:\B\to \B$ so that
  $t(x)=1$ for each $x$ in $\B$, then $(\B,t,0)$ satisfies the
  Induction Axiom and \axz, but not \axu.  Later (see Remark
  \ref{rem:lim-ord}) we shall have natural
  examples of structures satisfying \axz\ and \axu, but not
  Induction.  We shall also observe (in Remark \ref{rem:u-from-rec})
  that \axu\ is a consequence of the Recursion Theorem.
\end{remark}

\begin{exercise}
  Prove that \axz\ is a consequence of the Recursion Theorem.
\end{exercise}



\setcounter{equation}{0}
\section{Binary operations by recursion}\label{sect:arith-ops}  
\markright{\sectbegin Binary operations by recursion}

The Recursion Theorem guarantees the existence of certain \emph{unary}
functions on $\N$.  As in Theorem \ref{thm:addition}, we can get the
binary operation of addition by obtaining the unary operations $y\mapsto
x+y$.  By recursion, we can define addition as the unique operation
such that
\begin{equation*}
x+0=x\land x+\scr y=\scr{(x+y)}
\end{equation*}
for all $x$ and $y$ in $\N$.  In the same way, we can define
multiplication by
\begin{equation*}
  x\cdot 0=0
    \land x\cdot \scr y=x\cdot y+x.
\end{equation*}
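Read as programs, these equations compute addition and multiplication
from the successor-operation alone.  A Python sketch (the names
\texttt{add} and \texttt{mul} are ours):

```python
def add(x, y):
    # x + 0 = x;  x + succ(y) = succ(x + y)
    return x if y == 0 else add(x, y - 1) + 1

def mul(x, y):
    # x . 0 = 0;  x . succ(y) = x . y + x
    return 0 if y == 0 else add(mul(x, y - 1), x)
```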

The definition of exponentiation can follow this pattern:

\begin{definition}
The binary operation
  $(x,y)\mapsto x^y$
on $\N$ is given by:
\begin{equation}\label{eqn:exp}
  x^0=1 \land
x^{\scr y}=x^y\cdot x.
\end{equation}
\end{definition}
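The same pattern, one level up, yields exponentiation as iterated
multiplication; a sketch in the same style (the name \texttt{power} is
ours):

```python
def power(x, y):
    # x**0 = 1;  x**succ(y) = x**y * x
    return 1 if y == 0 else power(x, y - 1) * x
```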

In fact, we have something a bit more general.  A \defn{monoid} is a
structure $(A,\cdot,1)$ in which $\cdot$ is associative, and $a\cdot
1=a=1\cdot a$ for all $a$ in $A$.  The monoid is \defn{commutative} if
$\cdot$ is commutative.

\begin{theorem}
Suppose $\str A$ is a monoid.  For every
  $y$ in $\N$, there is a unique
  operation $x\mapsto x^y$ on $A$ such that \eqref{eqn:exp}
  holds for all $x$ in $A$ and all $y$ in $\N$.
\end{theorem}

\begin{proof}
  Let $c$ be the
  operation $x\mapsto 1$ on $A$, let $B$ be the
  set of unary operations on $A$,
   and let $f$ be the operation
   \begin{equation*}
     h\longmapsto(x\mapsto h(x)\cdot x)
   \end{equation*}
 on $B$.  By recursion, there is a function $g:\N\to B$
such that $g(0)=c$ and $g(\scr y)=f(g(y))$ for all $y$ in $\N$.  Now
define $x^y=g(y)(x)$.
\end{proof}

\begin{theorem}
  For all $x$ and $w$ in a commutative monoid, and for all
  $y$ and $z$ in $\N$, the following hold: 
  \begin{enumerate}
    \item 
      $x^{y+z}=x^yx^z$;
    \item
      $(x^y)^z=x^{yz}$;
    \item
      $(xw)^z=x^zw^z$.
  \end{enumerate}
\end{theorem}

\begin{exercise}
  Prove the theorem.
\end{exercise}


  The \defn{binomial coefficient} $\binom mn$ is entry $(m,n)$ in the
  following matrix:
  \begin{equation*}
    \begin{matrix}
1 & 0 & 0 & 0 & 0 & \cdots\\
1 & 1 & 0 & 0 & 0 &\\
1 & 2 & 1 & 0 & 0 &\\
1 & 3 & 3 & 1 & 0 &\\
1 & 4 & 6 & 4 & 1 &\\
\vdots &&&&&\ddots
    \end{matrix}
  \end{equation*}

We can give a formal definition by recursion:

\begin{definition}
  The binary operation $(x,y)\mapsto \binom xy$ on $\N$ is given by:
  \begin{equation}\label{eqn:binom}
\textstyle    \binom x0=1\land \binom 0{\scr y}=0\land \binom{\scr
  x}{\scr y}=\binom xy+\binom x{\scr y}.
  \end{equation}
\end{definition}
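Read directly as a program, \eqref{eqn:binom} generates the entries of
the matrix above; a Python sketch (the name \texttt{binom} is ours):

```python
def binom(x, y):
    # C(x, 0) = 1;  C(0, succ(y)) = 0;
    # C(succ(x), succ(y)) = C(x, y) + C(x, succ(y))
    if y == 0:
        return 1
    if x == 0:
        return 0
    return binom(x - 1, y - 1) + binom(x - 1, y)
```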

\begin{exercise}
  Show precisely that this is a valid definition by recursion.
\end{exercise}

As with exponentiation, we can define the binomial coefficients in
a more general setting.  The proof uses the same technique as the
proof of the Recursion Theorem:

  \begin{theorem}
For any structure $(A,{}\scr{},0)$ that satisfies \axu{} and admits
 induction, for every $y$ in $\N$, there is a unique 
  operation $x\mapsto \binom xy$ on $A$ such that \eqref{eqn:binom} holds
for all $x$ in $A$ and all $y$ in $\N$.
\end{theorem}

\begin{proof}
  Let $c$ be the
  operation $x\mapsto 1$ on $A$, and let $B$ be the
  set of unary operations on $A$.  We first prove that, for
  every $h$ in $B$, there is a unique operation $f(h)$ in $B$ given by
  \begin{equation*}
    f(h)(0)=0\land f(h)(\scr x)=h(x)+f(h)(x).
  \end{equation*}
Say $h\in B$, and let $\family S$ be the set whose members are the
subsets $R$ of $A\times A$ such that:
\begin{enumerate}
  \item
$(0,0)\in R$;
\item
$(x,t)\in R\implies (\scr x,h(x)+t)\in R$, for all $(x,t)$ in $A\times
  A$. 
\end{enumerate}
Then $\bigcap \family S$ is the desired operation $f(h)$.  (Why?)  By
recursion, there is a function $g:\N\to B$ such that $g(0)=c$ and
$g(\scr y)=f(g(y))$ for all $y$ in $\N$.  Now
define $\binom xy=g(y)(x)$.
\end{proof}

\begin{exercise}
  Supply the missing detail in the proof.
\end{exercise}

\begin{exercise}
  Prove that $\binom x1=x$ for all $x$ in $\N$.
\end{exercise}


See also Exercises \ref{exer:binom} and \ref{exer:bin-thm}.

In the proof of the last theorem, it was essential that the
successor-operation on $A$ be injective:

\begin{example}
  Let $A=\{0,1,2\}$, and define $s$ on $A$ by
  \begin{center}
    \begin{tabular}{ c || c | c | c}
$x$ & $0$ & $1$ & $2$\\ \hline
$s(x)$ & $1$ & $2$ & $1$
    \end{tabular}
  \end{center}
If we attempt to define $(x,y)\mapsto\binom xy$ on $A\times\N$, we get a
matrix
\begin{equation*}
  \begin{matrix}
    1 & 0 & 0\\
1 & 1 & 0\\
1 & 2 & 1\\
1 & 3 & 3
  \end{matrix}
\end{equation*}
(the last row recomputes the row for $1$, since $\scr 2=1$).  That is,
$\binom 12$ should be both $0$ and $3$.  So our attempt fails.
\end{example}




\setcounter{equation}{0}
\section{The integers and the rational numbers}\label{sect:int&rat} 
\markright{\sectbegin The integers and the rational numbers}

Arithmetic on the integers is determined by arithmetic on the natural
numbers.  Given $\N$, we could just \emph{define} the negative
integers by somehow attaching minus-signs.  A neater approach is
motivated as follows.

For each natural number $a$, we want there to be an integer $x$ such
that 
\begin{equation*}0=a+x.\end{equation*}
Then for each $b$ in $\N$, we should have
\begin{equation*}b=a+b+x.\end{equation*}
By these equations, the pairs $(0,a)$ and $(b,a+b)$ determine the same
integer; so we can define integers to be equivalence-classes of such
pairs.  

\begin{lemma}\label{lem:eq}
  On $\N\times\N$, let $\sim$ be the relation given by
\begin{equation*}(a,b)\sim(c,d)\Iff a+d=b+c.\end{equation*}
Then $\sim$ is an equivalence-relation.  If $(a_0,b_0)\sim(a_1,b_1)$ and
$(c_0,d_0)\sim(c_1,d_1)$, then
\begin{enumerate}
  \item
    $(a_0+c_0,b_0+d_0)\sim(a_1+c_1,b_1+d_1)$;
    \item
      $(b_0,a_0)\sim(b_1,a_1)$;
      \item\label{item:prod}
	$(a_0c_0+b_0d_0,b_0c_0+a_0d_0)\sim(a_1c_1+b_1d_1,b_1c_1+a_1d_1)$.
\end{enumerate}
\end{lemma}

\begin{exercise}
  Prove the lemma.  (For part (\ref{item:prod}), show that each
  member is equivalent to $(a_1c_0+b_1d_0,b_1c_0+a_1d_0)$.)
\end{exercise}

\begin{definition}
Let $\sim$ be as in Lemma \ref{lem:eq}.  We define
  $\Z$ to be $\N\times\N\modsim$.
Let the $\mathord{\sim}$-class of $(a,b)$ be denoted
\begin{equation*}a-b.\end{equation*}
By Lemma \ref{lem:eq},  we can define the operations $+$, $-$ and
$\cdot$ on $\Z$ by the following rules, where $a,b,c,d\in\N$:
\begin{enumerate}
\item\label{item:ambig-sum}
  $(a-b)+^{\Z}(c-d)=(a+^{\N}c)-(b+^{\N}d)$;
  \item
    $-^{\Z}(a-b)=b-a$;
    \item
      $(a-b)\cdot^{\Z}(c-d)=
      (a\cdot^{\N}c+^{\N}b\cdot^{\N}d)-(b\cdot^{\N}c+^{\N}a\cdot^{\N}d)$.  
\end{enumerate}
\end{definition}

  Note that, by the definition, an integer like $5-3$ is \emph{not} the
  natural number $2$; 
  it is not a natural number at all; it is the equivalence-class
  \begin{equation*}\{(2,0),(3,1),(4,2),(5,3),\dots\},\end{equation*} 
which is  $\{(x,y)\in\N^2:x=y+2\}$.
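The definition can be followed concretely by computing with the pairs
themselves.  In the sketch below (all the names are ours), an integer
is represented by any pair $(a,b)$ of natural numbers standing for
$a-b$, and the operations are those of the definition:

```python
def z_equal(p, q):
    # (a, b) ~ (c, d)  iff  a + d = b + c
    (a, b), (c, d) = p, q
    return a + d == b + c

def z_add(p, q):
    (a, b), (c, d) = p, q
    return (a + c, b + d)

def z_neg(p):
    a, b = p
    return (b, a)

def z_mul(p, q):
    (a, b), (c, d) = p, q
    return (a * c + b * d, b * c + a * d)
```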

\begin{theorem}
The function
$x\mapsto x-0:\N\to\Z$ is injective and preserves $+$ and $\cdot$,
that is,  
\begin{enumerate}
  \item
    $(x+^{\N}y)-0=(x-0)+^{\Z}(y-0)$;
    \item
      $(x\cdot^{\N}y)-0=(x-0)\cdot^{\Z}(y-0)$
\end{enumerate}
for all $x$ and $y$ in $\N$.
On $\Z$, addition and multiplication are commutative and associative,
and multiplication distributes over addition.  Finally, 
\begin{equation*}x+^{\Z}(-^{\Z}x)=0-0\end{equation*}
for all $x$ in $\Z$.
\end{theorem}

\begin{exercise}
  Prove the theorem.
\end{exercise}

\begin{definition}
On $\Z$, define the binary operation $-$ by
\begin{equation*}x-^{\Z}y=x+^{\Z}(-^{\Z}y).\end{equation*}
\end{definition}

\begin{lemma}
If $x,y\in\N$, then the integer
  $x-y$ is $(x-0)-^{\Z}(y-0)$.
\end{lemma}

Now we can identify the natural numbers with their images in $\Z$,
considering the natural number $x$ to be equal to the integer $x-0$.

We can define the \defn{rational numbers} similarly:

\begin{lemma}\label{lem:Qeq}
  On $\Z\times(\Z\setminus\{0\})$, let $\sim$ be the relation given by
\begin{equation*}
(a,b)\sim(c,d)\Iff ad=bc.
\end{equation*}
Then $\sim$ is an equivalence-relation.  If $(a_0,b_0)\sim(a_1,b_1)$ and
$(c_0,d_0)\sim(c_1,d_1)$, then
\begin{enumerate}
  \item
    $(a_0d_0\pm b_0c_0,b_0d_0)\sim(a_1d_1\pm b_1c_1,b_1d_1)$;
      \item
$(a_0c_0,b_0d_0)\sim(a_1c_1,b_1d_1)$;
\item
      $(b_0,a_0)\sim(b_1,a_1)$ and $(0,a_0)\sim(0,1)$ if $a_0\neq0$.    
\end{enumerate}
\end{lemma}

\begin{exercise}
  Prove the lemma.
\end{exercise}

\begin{definition}
Let $\sim$ be as in Lemma \ref{lem:Qeq}.  We define
  $\Q$ to be $\Z\times(\Z\setminus\{0\})\modsim$.
Let the $\mathord{\sim}$-class of $(a,b)$ be denoted
\begin{equation*}
\frac ab
\end{equation*}
or $a/b$.
By Lemma \ref{lem:Qeq},  we can define the operations $+$, $-$ and
$\cdot$ on $\Q$, and $x\mapsto x\inv$ on $\Q\setminus\{0/1\}$, by the
following rules, where $a,b,c,d\in\Z$: 
\begin{enumerate}
\item
$a/b\pm c/d=(ad\pm bc)/bd$;
  \item
$(a/b)(c/d)=ac/bd$;
\item
$(a/b)\inv=b/a$ if $a\neq0$.
\end{enumerate}
\end{definition}
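As with the integers, the rules can be computed on the pairs directly;
a Python sketch (again, all names are ours), where $(a,b)$ stands for
$a/b$:

```python
def q_equal(p, q):
    # (a, b) ~ (c, d)  iff  ad = bc
    (a, b), (c, d) = p, q
    return a * d == b * c

def q_add(p, q):
    (a, b), (c, d) = p, q
    return (a * d + b * c, b * d)

def q_mul(p, q):
    (a, b), (c, d) = p, q
    return (a * c, b * d)

def q_inv(p):
    a, b = p
    return (b, a)
```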

\begin{theorem}
  The function $x\mapsto x/1:\Z\to\Q$ is injective and preserves $+$,
  $-$ and $\cdot$.  On $\Q$, addition and multiplication are
  commutative and associative,
and multiplication distributes over addition.  Finally, 
\begin{equation*}x\cdot x\inv=\frac 11
\end{equation*}
for all $x$ in $\Q\setminus\{0/1\}$.
\end{theorem}

\begin{exercise}
  Prove the theorem.
\end{exercise}

Now we can identify the integers with their images in $\Q$,
considering the integer $x$ to be equal to the rational number $x/1$.

\setcounter{equation}{0}
\section{Recursion generalized}\label{sect:recursion-gen}
\markright{\sectbegin Recursion generalized}

How can we define $n$\defn{-factorial}, $(n!)$?  Informally, we write
\begin{equation*}
n!=1\cdot 2\cdot 3\cdots(n-1)\cdot n.
\end{equation*}
For a formal recursive definition, we can try
\begin{equation}\label{eqn:factorial}
  0!=1\land (\scr x)!=\scr x\cdot x!
\end{equation}
---but for this to be valid by the Recursion Theorem, we need an
   operation $f$ on $\N$ such that $f(x!)=\scr x\cdot x!$ for all $x$.
   No such operation can exist: since $0!=1!$, it would have to send
   $1$ both to $\scr 0\cdot 0!$, which is $1$, and to $\scr 1\cdot
   1!$, which is $2$.

The definition \eqref{eqn:factorial} is valid by the following:

\begin{theorem}[Recursion with parameter]\label{thm:recursion-p}
Suppose $B$ is a set with an element $c$.  Suppose $F$ is a function
from $\N\times B$ to $B$.  Then there is a \emph{unique} function
$G:\N\to B$ such that $G(0)=c$ and 
\begin{equation}\label{eqn:str-rec}
G(\scr x)=F(x,G(x))
\end{equation}
for all $x$ in $\N$.
\end{theorem}

\begin{proof}
  Let $f$ be the function 
\begin{equation*}
(x,b)\longmapsto(\scr x,F(x,b)):
\N\times B\longrightarrow\N\times B.
\end{equation*}
By recursion, there is a unique function $g$ from $\N$ to
  $\N\times B$ such that $g(0)=(0,c)$ and 
\begin{equation*}
g(\scr x)=f(g(x))
\end{equation*} 
for all
  $x$ in $\N$.  Now let $G$ be $\pi\circ g$, where $\pi$ is the
  function 
\begin{equation*}
(x,b)\longmapsto b:\N\times B\longrightarrow B.
\end{equation*}  
Then for each $x$ in $\N$ we have $g(x)=(y,G(x))$ for some $y$ in
$\N$.  We can prove by induction that $y=x$.  Indeed, this is the
case when $x=0$, since $g(0)=(0,c)$.  Suppose $g(x)=(x,G(x))$ for some
$x$ in $\N$.  Then
\begin{equation}\label{eqn:strong-rec}
  g(\scr x)=f(g(x))=f(x,G(x))=(\scr x,F(x,G(x))).
\end{equation}
In particular, the first entry in the value of $g(\scr x)$ is $\scr x$.  This
completes our induction.  

We now know that $g(x)=(x,G(x))$ for all $x$ in $\N$.  Hence in
particular $g(\scr x)=(\scr x,G(\scr x))$.  But we also have
(\ref{eqn:strong-rec}).  Therefore we have (\ref{eqn:str-rec}),
as desired.  Finally, each of $g$ and $G$ determines the other.  Since
$g$ is unique, so is $G$.
\end{proof}
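The pair-trick of the proof can be traced computationally: the
auxiliary function $g$ carries the index along with the value, so that
$F$ always has the parameter it needs.  A Python sketch (the names are
ours), applied to the factorial:

```python
def recursion_with_parameter(c, F):
    """G(0) = c and G(succ(x)) = F(x, G(x)), computed via the
    auxiliary function g(x) = (x, G(x)) from the proof."""
    def G(n):
        pair = (0, c)                  # g(0) = (0, c)
        while pair[0] < n:
            x, b = pair
            pair = (x + 1, F(x, b))    # g(succ(x)) = f(g(x))
        return pair[1]                 # G = pi o g
    return G

# 0! = 1 and (succ(x))! = succ(x) * x!
factorial = recursion_with_parameter(1, lambda x, b: (x + 1) * b)
```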

\begin{example}
  We can define a function $f$ on $\N$ by requiring $f(0)=0$ and
  $f(\scr x)=x$.  This is a valid recursive definition, by Theorem
  \ref{thm:recursion-p}.  Note that $f$ picks out the immediate
  predecessor of a natural number, when this exists.  
\end{example}

\begin{remark}\label{rem:u-from-rec}
In the example, since $f$ is
  unique, we see that \axu\ follows from the Recursion Theorem.
\end{remark}


\begin{definition}
  For any function $f:\N\to M$, where $M$ is a set equipped with addition and
  multiplication, we define the sum 
  $\sum_{k=0}^nf(k)$ and the product $\prod_{k=0}^nf(k)$ recursively
  as follows:
  \begin{itemize}
    \item
      $\displaystyle\sum_{k=0}^0f(k)=f(0)$ and
      $\displaystyle\sum_{k=0}^{\scr
      n}f(k)=\displaystyle\sum_{k=0}^nf(k)+f(\scr n)$; 
      \item
	$\displaystyle\prod_{k=0}^0f(k)=f(0)$ and
	$\displaystyle\prod_{k=0}^{\scr
	n}f(k)=\left(\displaystyle\prod_{k=0}^nf(k)\right)f(\scr n)$. 
  \end{itemize}
\end{definition}
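These two recursions translate directly into code; a Python sketch
(the names \texttt{rec\_sum} and \texttt{rec\_prod} are ours):

```python
def rec_sum(f, n):
    # sum_{k=0}^{0} f(k) = f(0);
    # sum_{k=0}^{succ(n)} f(k) = sum_{k=0}^{n} f(k) + f(succ(n))
    return f(0) if n == 0 else rec_sum(f, n - 1) + f(n)

def rec_prod(f, n):
    # prod_{k=0}^{0} f(k) = f(0);
    # prod_{k=0}^{succ(n)} f(k) = (prod_{k=0}^{n} f(k)) * f(succ(n))
    return f(0) if n == 0 else rec_prod(f, n - 1) * f(n)
```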

\begin{exercise}
Prove the following.
\begin{enumerate}
  \item
  $\sum_{k=0}^n(k+1)=(n^2+3n+2)/2$
  \item
  $\sum_{k=0}^n(k+1)^2=(2n^3+9n^2+13n+6)/6$
  \item
  $\sum_{k=0}^nb^k=(b^{n+1}-1)/(b-1)$
  \item
  $\sum_{k=0}^n(2k+1)=(n+1)^2$
  \item
  $\prod_{k=0}^n((k+1)/(k+2))=1/(n+2)$
\end{enumerate}
\end{exercise}

\setcounter{equation}{0}
\section{The ordering of the natural numbers}\label{order}
\markright{\sectbegin The ordering of the natural numbers}

In $\N$, if $\scr x=y$, then $x$ is an immediate predecessor of $y$,
and we know that $x$ is unique.  More generally, we should like to say
that $z$ is a \tech{predecessor} of $y$ if $y$ is $\scr z$, or
$\scr{(\scr z)}$, or $\scr{(\scr{(\scr z)})}$, or
$\scr{(\scr{(\scr{(\scr z)})})}$, 
or \dots.  We can take care of the dots using 
recursion.

\begin{definition}\label{dfn:preds}
Let the function $x\mapsto \pis x:\N\to\pow{\N}$ be given by the
rule: 
\begin{equation*}
  \pis 0=\emptyset\land\Forall x\pis{\scr x}=\pis x\cup\{x\}.
\end{equation*}
The elements of $\pis x$ are the \defn{predecessors of $x$}.
\end{definition}
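The recursion in the definition can be run as a program; a Python
sketch (the name \texttt{predecessors} is ours) confirming that
$\pis x$ is the set of numbers below $x$:

```python
def predecessors(x):
    # pred(0) = {};  pred(succ(x)) = pred(x) union {x}
    return set() if x == 0 else predecessors(x - 1) | {x - 1}
```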

We shall prove in this section that the binary relation
\begin{equation*}\{(x,y):x\in\pis y\}\end{equation*}
on $\N$ is a strict total ordering.  It will be important in
\S\ssp\ref{model} that everything proved in this section is a
consequence of just two facts:
\begin{itemize}
  \item
$\N$ admits induction.
\item
A function  $x\mapsto \pis x:\N\to\pow{\N}$ does exist as given by Definition
\ref{dfn:preds}.
\end{itemize}


We shall have to be precise with the relations $\in$ and $\included$,
which are
\Eng{containment} and \Eng{inclusion} respectively.  The relation
$\pincluded$ is \emph{proper} inclusion (the intersection of
$\included$ and $\neq$).  We have:
\begin{center}
\begin{tabular}{|c@{$\Iff$}c@{$\Iff$}c|}\hline
$x\in A$ & $x$ is an element of $A$ & $A$ contains $x$\\ \hline
$x\included A$ & $x$ is a subset of $A$ & $A$ includes $x$\\ \hline
\end{tabular}
\end{center}

We shall show first that $y\in\pis x\Iff\pis y\pincluded \pis x$ for all $x$
and $y$ in $\N$.

\begin{lemma}\label{lem:el-inc}
$\N$ satisfies
\begin{equation}\label{eqn:in-pinc}
  \Forall y (y\in \pis x\to \pis y\pincluded \pis x\land
  \pis{\scr y}\included \pis x) 
\end{equation}
whenever $x\in\N$.  Hence $\Forall x x\notin\pis x$; also, the
map $x\mapsto \pis x$ is injective.
\end{lemma}

\begin{proof}
 The formula \eqref{eqn:in-pinc} is satisfied when $x=0$.
Suppose it is satisfied when $x=z$.  By contraposition, this means that, since
$\pis z\not\pincluded \pis z$, we have $z\notin \pis z$.  Therefore
$\pis z\pincluded \pis{\scr z}$.
Say $y\in \pis{\scr z}$. Then either $y\in \pis z$ or $y=z$.  In the
former case, 
$\pis y\pincluded \pis z$ by inductive hypothesis.  Hence in either case,
$\pis y\included \pis z$.  Therefore $\pis y\pincluded \pis{\scr z}$;
also, $\{y\}\included
\pis{\scr z}$, so $\pis{\scr y}\included \pis{\scr z}$.  So \eqref{eqn:in-pinc}
holds when $x=\scr z$.  By induction, \eqref{eqn:in-pinc} holds for all $x$
in $\N$. 

Since $\pis x\not\pincluded\pis x$, we have $x\notin\pis x$, again by the
contrapositive of \eqref{eqn:in-pinc}.  For the injectivity of
$x\mapsto\pis x$, note first that $\pis x=\pis 0\Iff x=0$.  Suppose
$\pis{\scr x}=\pis{\scr y}$, that is, $\pis x\cup\{x\}=\pis
y\cup\{y\}$.  Then either $y\in\pis x$ or $y=x$.  In the first case,
$\pis{\scr y}\included\pis x\pincluded\pis{\scr x}$ (since $x\notin\pis
x$), contradicting $\pis{\scr x}=\pis{\scr y}$; therefore $y=x$.

By Lemma \ref{lem:zero-succ} (whose proof uses only induction), we are
done. 
\end{proof}


\begin{lemma}\label{lem:inc-el}
$\N$ satisfies
\begin{equation}\label{eqn:pinc-in}
  \Forall y (\pis y\pincluded \pis x\to y\in \pis x)
\end{equation}
for each natural number $x$.  
\end{lemma}

\begin{proof}
The formula \eqref{eqn:pinc-in} holds when $x=0$. Suppose
\eqref{eqn:pinc-in} is true when $x=z$.  Say $y\in\N$ and
$\pis y\pincluded \pis{\scr z}$.  Then
$\pis{\scr z}\not\included \pis y$, so $z\notin \pis y$ by Lemma
\ref{lem:el-inc}. Hence
$\pis y\included \pis z$.  If $\pis y\pincluded \pis z$, then $y\in
\pis z$ by
inductive hypothesis.  If $\pis y=\pis z$, then $y=z$, so $y\in\{z\}$.
In either case, $y\in
\pis{\scr z}$. Thus \eqref{eqn:pinc-in} holds when $x=\scr z$.
\end{proof}


\begin{definition}\label{defn:<}
If $x,y\in\N$, we write 
\begin{equation*}x<y\end{equation*} 
instead of $x\in\pis y$; so $<$ is a binary relation on
$\N$.  We write 
\begin{equation*}x\leq y\end{equation*} 
 just in case $x<y\lor x=y$, equivalently, $\pis x\included\pis y$. 
\end{definition}

To prove that $\leq$ has the properties we expect, we need a new
proof-technique: 

\begin{theorem}[Strong Induction]\label{thm:si}
If $A\included \N$, then $\N$ satisfies
\begin{equation}\label{eqn:strong-induction}
  \Forall x(\pis x\included A\to x\in A)\to\Forall
  x x\in A.
\end{equation}
\end{theorem}

\begin{proof}
Suppose $A\included\N$, and $\pis x\included A\implies x\in A$ for all
$x$ in $\N$.  We shall show that $\pis x\included A$ for all $x$ in
$\N$.  This is trivially true when $x=0$, since $\pis 0=\emptyset$.
Suppose $\pis z\included A$.  Then $z\in A$ by assumption, so
\begin{equation*}\pis{\scr z}=\pis z\cup\{z\}\included A.\end{equation*}  
Hence, by induction, $\pis
x\included A$ for all $x$.  In particular, $x\in\pis{\scr x}$, but
$\pis{\scr x}\included A$, so $x\in A$.
\end{proof}


As a consequence of the Strong Induction Theorem, we have the
following method of proof.  \emph{For any unary relation $P$ on $\N$,
in order to prove $\N\models\Forall x
   P(x)$, it is enough to prove one thing:
  \begin{enumerate}
    \item
$\N\models\Forall x
  (\Forall y(y<x\to  P(y))\to P(x))$, that is,
  $x\in P$ under the assumption that $x$ is a natural number
  and every predecessor of $x$ is in $P$.
  \end{enumerate}
}
The assumption that $x\in\N$ and $\Forall y(y<x\to  P(y))$ is
the \defn{strong inductive hypothesis}.


\begin{theorem}
$(\N,\leq)$ is a total order.
\end{theorem}

\begin{proof}
$(\N,\leq)$ is a partial order since
\begin{equation*}x\leq y\Iff\pis x\included \pis y\end{equation*}
for all $x$ and $y$ in $\N$, and $x\mapsto\pis x$ is injective, by the
preceding lemmas. 
It remains to show that $(\N,\leq)$ is a total order, equivalently,
$\N$ satisfies 
  \begin{equation}\label{eqn:total}
    \Forall y(\pis y\not\pincluded \pis x\to \pis x\included \pis y).
  \end{equation}
We shall prove this by strong induction, that is, Theorem
\ref{thm:si}.  Let $A$ be the set of $x$ in $\N$ such that
\eqref{eqn:total} holds.

Suppose $\pis z\included A$, that is, \eqref{eqn:total} holds
whenever $x\in \pis z$.  We shall show that \eqref{eqn:total} holds
when $x=z$.

Suppose $\pis y\not\pincluded \pis z$; we shall show $\pis z\included
\pis y$.  Say $x\in \pis z$.  By strong inductive hypothesis,
\eqref{eqn:total} holds. 
But $\pis x\pincluded \pis z$, by Lemma \ref{lem:el-inc},
so $\pis y\not\pincluded \pis x$, hence $\pis x\included \pis y$ by
\eqref{eqn:total}.  But $\pis x\neq \pis y$, so $\pis x\pincluded \pis
y$, whence
$x\in \pis y$.  Thus $\pis z\included\pis y$.  Therefore
\eqref{eqn:total} holds when $x=z$.  By strong induction, the proof is
complete.  
\end{proof}

%  Note that in any set ordered by $<$, we could define $\pis x$ to be
%  $\{y:y<x\}$.  

\begin{lemma}\label{lem:succ-p-ord}
  $\N\models\Forall x\Forall y(x<y\to\scr
  x<\scr y)$.
\end{lemma}

\begin{exercise}
  Prove the lemma directly (without induction) using the previous
  lemmas. 
\end{exercise}

\begin{theorem}\label{thm:ineq}
  $\N$ satisfies:
  \begin{enumerate}
    \item
$\Forall x0\leq x$;
\item\label{part:cancellation}
$\Forall x\Forall y\Forall z(x<y\leftrightarrow x+z<y+z)$;
\item
$\Forall x\Forall y\Forall z(x<y\to x\cdot\scr
  z<y\cdot\scr z)$;
\item\label{part:subtract}
$\Forall x\Forall y \Exists z(x\leq y\leftrightarrow
  x+z=y)$.
  \end{enumerate}
\end{theorem}

\begin{exercise}
  Prove the theorem.
\end{exercise}

\begin{definition}
  If $x,y\in\N$, and $x\leq y$, then
\begin{equation*}y-x\end{equation*}
is the natural number $z$ (which exists and is unique by Theorem
  \ref{thm:ineq}, parts \pref{part:cancellation} and
  \pref{part:subtract}) such that $x+z=y$.
\end{definition}


\begin{exercise}
Prove the following in $\N$.
\begin{enumerate}
    \item
  $\Forall x\Forall y 1+xy\leq (1+x)^y$
  \item
  $\Forall x(3<x\to x^2<2^x)$
\end{enumerate}
\end{exercise}

\begin{exercise}
Find the flaw in the following argument, where $\max$ is the function
from $\N\times\N$ to $\N$ such that
$\max(x,y)=y$ if $x\leq y$, and otherwise $\max(x,y)=x$. 
\begin{quote}
If
$\max(x,y)=0$, then  
$x=y$.  Suppose that $x=y$ whenever $\max(x,y)=n$.  Suppose
$\max(z,w)=n+1$.  Then $\max(z-1,w-1)=n$, so $z-1=w-1$ by inductive
hypothesis; therefore $z=w$.  Therefore all natural numbers are equal.
\end{quote}
\end{exercise}

\begin{exercise}\label{exer:binom}
  Prove that, if $y\leq x$, then $\displaystyle\binom
  xy=\displaystyle\frac{x!}{y!\,(x-y)!}$. 
\end{exercise}

\begin{exercise}\label{exer:bin-thm}
  Prove the \defn{Binomial Theorem}: 
\begin{equation*}(x+y)^n=\sum_{i=0}^n\binom ni
  x^{n-i}y^i.\end{equation*} 
\end{exercise}

\begin{exercise}
If $x,y\in\N$, we write $x\divides y$ if $\Exists z xz=y$; in
this case we say that $x$ is a \defn{divisor} of $y$.  A natural
number is \defn{prime} if its only divisors are $1$ and itself, and
these are distinct.  Show that every natural number different from $1$
has a prime divisor.
\end{exercise}

\begin{definition}
  If $x\in\N$, then for the set $\pis x$, we may write
\begin{equation*}\{0,\dots,x-1\}.\end{equation*}
Here, the notation $x-1$ has no independent meaning if $x=0$; in this
case, $\{0,\dots,x-1\}=\emptyset$.  For $\pis{\scr x}$, we may write
\begin{equation*}\{0,\dots,x\}.\end{equation*}
Similarly, if $G$ is a function on $\N$, then, recalling the
definition on p.~\pageref{eqn:setim}, we may write
\begin{align*}
  G\setim{\pis x}&=\{G(0),\dots,G(x-1)\},\\
G\setim{\pis{\scr x}}&=\{G(0),\dots,G(x)\}.
\end{align*}
\end{definition}

In this notation, by strong induction, a subset $A$ of $\N$ is equal
to $\N$, provided
\begin{equation*}\{0,\dots,x-1\}\included A\implies x\in A\end{equation*}
for all $x$ in $\N$.  This condition is logically
\tech{stronger}---harder to satisfy---than the condition
\begin{equation*}x\in A\implies\scr x\in A\end{equation*}
in the Induction Axiom.  To make this precise, first note that
the following agrees with Definition \ref{defn:<} in case $(X,\leq)$
is $(\N,\leq)$:

\begin{definition}
  If $(X,\leq)$ is a total order, and $x\in X$, let
  \begin{equation*}
    \pis x=\{y\in X:y<x\}.
  \end{equation*}
\end{definition}

In this section, we have \emph{used} the Strong-Induction Theorem to
prove that $(\N,\leq)$ is a total order.  But if we
already have a total order, then we can say that it \defn{admits
  (proof by) strong induction} if it satisfies
\eqref{eqn:strong-induction} of Theorem \ref{thm:si}.

\begin{theorem}
  A structure $\str A$ that admits induction and has a total ordering
  admits strong induction, provided also that
  \begin{equation*}
    x<\scr x
  \end{equation*}
for all $x$ in $A$.
\end{theorem}

\begin{exercise}
  Prove the theorem.
\end{exercise}

Section \ref{sect:well-ordered} will introduce
totally ordered sets in which strong induction works, but ordinary
induction may not.  


\setcounter{equation}{0}

\section{The real numbers}
\markright{\sectbegin The real numbers}

Recall from \S~\ref{sect:int&rat} that every integer is a difference
$x-y$ of two natural 
numbers, and every rational number is a quotient $u/v$ of two integers.

\begin{lemma}
  There is a well-defined subset $P$ of $\Z$ consisting of those
  differences $a-b$ of natural numbers $a$ and $b$ such that $b<a$.
There is a
 unique strict linear ordering $<$ of $\Z$ such that
  \begin{equation*}
    x<y\Iff y-x\in P
  \end{equation*}
for all $x$ and $y$ in $\Z$.  The embedding $x\mapsto x-0:\N\to \Z$
preserves $<$.
\end{lemma}

\begin{lemma}
  There is a well-defined subset $P$ of $\Q$ consisting of those
  quotients $a/b$ of integers $a$ and $b$ such that $0<ab$.
There is a
 unique strict linear ordering $<$ of $\Q$ such that
  \begin{equation*}
    x<y\Iff y-x\in P
  \end{equation*}
for all $x$ and $y$ in $\Q$.  The embedding $x\mapsto x/1:\Z\to \Q$
preserves $<$.
\end{lemma}

\begin{definition}
  A \defn{cut} of a linear order $(X,\leq)$ is a subset $\cut a$ such
  that: 
  \begin{enumerate}
    \item
$\emptyset\pincluded\cut a\pincluded X$;
\item
$x<y\land y\in\cut a\implies x\in \cut a$;
\item
$\Forall y(y<x\to y\in \cut a)\implies x\in \cut a$.
  \end{enumerate}
The set $\R$ of \defn{real numbers} is the set of cuts of $\Q$.
\end{definition}

\begin{exercise}
  Define $+$, $\cdot$ and $<$ on $\R$.  Show that the function
  $x\mapsto\{y\in\Q:y\leq x\}:\Q\to\R$ is an injection that preserves
  $+$, $\cdot$ and $<$.
\end{exercise}

If $a,b\in \R$, then $[a,b)$ is the set $\{x\in \R:a\leq x< b\}$.

\begin{theorem}\label{thm:binary}
  For each $\cut a$ in $[0,1)$, there is a unique function $n\mapsto
  a_n:\N\to \{0,1\}$ such that
  \begin{equation*}
    \sum_{k=0}^{n}\frac {a_k}{2^k}\leq
    \cut a<\sum_{k=0}^n\frac{a_k}{2^k}+\frac 1{2^n}
  \end{equation*}
for all $n$ in $\N$.
\end{theorem}
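The digits can be computed greedily, adding $1/2^k$ at stage $k$
exactly when doing so does not overshoot; a Python sketch (the name
\texttt{binary\_digits} is ours), using exact rational arithmetic in
place of cuts:

```python
from fractions import Fraction

def binary_digits(a, n):
    """The digits a_0, ..., a_n of a in [0, 1), chosen greedily so
    that sum(a_k / 2**k) <= a < sum(a_k / 2**k) + 1 / 2**n."""
    digits, total = [], Fraction(0)
    for k in range(n + 1):
        if total + Fraction(1, 2**k) <= a:
            total += Fraction(1, 2**k)
            digits.append(1)
        else:
            digits.append(0)
    return digits
```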

\begin{exercise}
  Prove the theorem.
\end{exercise}

\setcounter{equation}{0}
\section{Well-ordered sets}\label{sect:well-ordered} 
\markright{\sectbegin Well-ordered sets}

\begin{definition}
  A total order is called \defn{well-ordered} if every non-empty
  subset has a least element.  The least element of a subset $A$ can
  be denoted 
  \begin{equation*}
      \min A.
  \end{equation*}
\end{definition}

\begin{definition}
  A total order $(X,\leq)$ \defn{admits (definition by) strong
  recursion} if, for every set $B$ and function $h:\pow B\to B$, there
  is a unique function $G:X\to B$ such that
  \begin{equation*}
    G(x)=h(G\setim{\pis x})
  \end{equation*}
for all $x$ in $X$.
\end{definition}

\begin{theorem}\label{thm:well-ordered}
  The following are equivalent conditions on a total order:
  \begin{enumerate}
    \item
It is well-ordered.
\item
It admits strong induction.
\item
It admits strong recursion.
  \end{enumerate}
\end{theorem}

\begin{proof}
Let $(X,\leq)$ be a total order.

  Suppose $(X,\leq)$ is
  well-ordered.  If $A\pincluded X$ and $x=\min (X\setminus A)$, then
  $\pis x\included A$, but $x\notin A$.  By 
  contraposition, if $\pis y\included A\implies y\in A$ for all $y$ in
  $X$, then $A=X$; that is, $(X,\leq)$ admits strong induction. 

Suppose $(X,\leq)$ admits strong induction.  Say $h$ is a function
  from $\pow B$ to $B$.
  Let $A$ be the subset of $X$ consisting of those $x$ for which there
  is a unique function $G_x:\pis x\cup\{x\}\to B$ such that
\begin{equation*}
G_x(y)= h(G_x\setim{\pis y})
\end{equation*}
for all $y$ in $\pis x\cup\{x\}$.  Say $\pis x\included A$.  To show
that $x\in A$, we can define
$G_x:\pis x\cup\{x\}\to B$ by
\begin{equation*}
G_x(y)=
\begin{cases}
  G_y(y),          &\text{ if }y<x;\\
 h(\{G_z(z):z<x\}),&\text{ if }y=x.
\end{cases}
\end{equation*}
Then $G_x$ is a function witnessing that $x\in A$ (why?).  By strong
induction, $A=X$.  Now we can let $G$ be $x\mapsto G_x(x):X\to B$.
This shows that $(X,\leq)$ admits strong recursion.  (Why is $G$
unique?) 

Finally, suppose $(X,\leq)$ is \emph{not} well-ordered.  In
particular, suppose $\emptyset\pincluded A\included X$, but $A$ has
no least element.  Replacing $A$ if necessary with
$\{x\in X:\Exists y(y\in A\land y\leq x)\}$, we may assume that $A$ is
closed upwards; the replacement is still non-empty and still has no
least element.  Let $\B=\{0,1\}$, and let $h:\pow{\B}\to \B$ be given by
\begin{equation*}
  h(x)=1\Iff 1\in x.
\end{equation*}
For each $i$ in $\B$, let $G_i:X\to \B$ be given by
\begin{equation*}
  G_i(x)=
  \begin{cases}
    0,& \text{ if }x\notin A;\\
    i,& \text{ if }x\in A.
  \end{cases}
\end{equation*}
Then $G_i(x)=h(G_i\setim{\pis x})$ for all $x$ in $X$.  Since there
are two functions $G_i$, the order $(X,\leq)$ does not admit strong
recursion. 
\end{proof}

\begin{exercise}
  Supply the missing details in the last proof.
\end{exercise}

\begin{remark}\label{rem:well-ordered}
  That $X$ is a \emph{set} is not used in the proof of the theorem.
  We shall later (in \S~\ref{sect:ordinals}) consider well-ordered
  \emph{classes}. 
\end{remark}

\begin{corollary}\label{cor:strongrec}
  $(\N,\leq)$ is well-ordered.
In particular,
suppose $B$ is a set, and $h$ is a function from $\pow B$ to $B$.
Then there is a unique function $G:\N\to B$ such that
\begin{equation*}G(x)=h(\{G(0),\dots,G(x-1)\})\end{equation*}
for all $x$ in $\N$.  
\end{corollary}
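The corollary can be traced computationally; a Python sketch (the
names are ours), in which the hypothetical $h$ picks the least natural
number missing from its argument, so that $G$ is forced to be the
identity:

```python
def strong_recursion(h):
    # G(x) = h({G(0), ..., G(x-1)}), as in the corollary.
    def G(x):
        return h(frozenset(G(k) for k in range(x)))
    return G

# h(s) = least natural number not in s.
G = strong_recursion(lambda s: min(set(range(len(s) + 1)) - s))
```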



\setcounter{equation}{0}
\section{A model of the Peano axioms}\label{model}
\markright{\sectbegin A model of the Peano axioms}

We have assumed the existence of natural numbers that satisfy the Peano
axioms.  We have made no assumptions about each natural number in
itself.  Now we shall construct a \emph{model} of the Peano axioms.
We shall be able to describe each element of this model.

Definition \ref{dfn:preds} suggests a way to proceed.  Why not define
the natural numbers so that each one is \emph{identical} to the set of its
predecessors?  We can do this recursively, provided that we have
\emph{some} model $\N$ of the Peano axioms.  Indeed, we can
define a function $f$ on $\N$ by the rule
\begin{equation*}f(0)=\emptyset\land\Forall x f(\scr
  x)=f(x)\cup\{f(x)\}.
\end{equation*} 
Then the range of $f$ determines a model of the Peano axioms in which
$0$ is the empty set, the successor-operation is the map $x\mapsto
x\cup\{x\}$, and each number \emph{is} the set of its predecessors.
The first five elements---namely $0$, $1$, $2$, $3$ and
$4$---of this model are: 
\begin{equation*}\emptyset, 
\{\emptyset\}, 
\{\emptyset,\{\emptyset\}\},
\{\emptyset,\{\emptyset\},\{\emptyset,\{\emptyset\}\}\}, 
\{\emptyset, \{\emptyset\}, \{\emptyset,\{\emptyset\}\},
    \{\emptyset,\{\emptyset\},\{\emptyset,\{\emptyset\}\}\}\}.\end{equation*} 
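The recursion defining $f$ can also be imitated concretely.  Here is a small Python sketch (an illustration only; \texttt{frozenset} stands in for hereditarily finite sets) computing the first von Neumann numerals:

```python
def successor(a):
    # the successor of a set A is A ∪ {A}
    return a | frozenset({a})

def von_neumann(n):
    # f(0) = ∅ and f(x + 1) = f(x) ∪ {f(x)}
    a = frozenset()
    for _ in range(n):
        a = successor(a)
    return a

# each numeral is exactly the set of its predecessors:
assert von_neumann(4) == frozenset(von_neumann(k) for k in range(4))
# and it has as many elements as the number it represents:
assert len(von_neumann(5)) == 5
```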

An alternative way to construct this model is to forget the Peano
 axioms and proceed as follows.  

 \begin{definition}\label{dfn:suc}
   The \defn{successor} of any set is the smallest set that contains
   and includes it.  So, the successor of a set $A$ is the set
\begin{equation*}A\cup\{A\}.\end{equation*}
We shall denote this set by $\vscr A$.
 \end{definition}

Thus, for the moment, the successor of a natural number and the
successor of a set are called by the same word, but have different
symbols. 

We propose to assume:

\begin{axiom}[Infinity]\label{ax:infinity}
There is a set
  $\Omega$ (`Omega') of sets such that $\emptyset\in\Omega$,
  and $\vscr A\in\Omega$ 
  whenever $A\in \Omega$.
\end{axiom}

It seems reasonable to assume that, given a set, we can always form
its successor.  After all, we can do this symbolically, as in
Definition \ref{dfn:suc}.  The Axiom of Infinity is that we---or some
  being---can have started with the
  empty set, and can have repeatedly taken successors \emph{until no
    more successors can be taken.}  Expressed in these terms, the
  Axiom is a philosophically problematic assumption.
Nonetheless, like most (though not all) mathematicians,  we shall make
this assumption.  

In a more benign formulation, the Axiom of Infinity is just that some
set $\Omega$ contains $\emptyset$ and includes the image of itself
under the successor-operation $A\mapsto \vscr A$.  Then we can form
the structure $(\Omega,{}\vscr{},\emptyset)$.  There may be more than
one such set $\Omega$, but the intersection of such sets is still such
a set.  Hence there is a \emph{smallest} such set.  We give it a name:

\begin{definition}
We denote by 
\begin{equation*}\varN\end{equation*} 
(`omega') the smallest set of sets that contains $\emptyset$ and
includes its own image under the successor-operation.
\end{definition}

\begin{exercise}
  Verify that there is exactly one set $\varN$ as given by the
  definition. 
\end{exercise}

\begin{lemma}\label{lem:zi}
  $(\varN,{}\vscr{},\emptyset)$ satisfies the Induction Axiom.
\end{lemma}

\begin{exercise}
  Prove the lemma.
\end{exercise}

\begin{lemma}
  Every element of $\varN$ is a subset of $\varN$.
\end{lemma}

\begin{exercise}
  Prove the lemma.
\end{exercise}

\begin{theorem}
  $(\varN,{}\vscr{},\emptyset)$ is a model of the Peano axioms.
\end{theorem}

\begin{proof}
  It is obvious that $(\varN,{}\vscr{},\emptyset)$ satisfies \axz.  By
  the last lemma, we have a map, namely
\begin{equation*}x\longmapsto x:\varN\To\pow{\varN},\end{equation*}
that takes $\vscr x$ to $x\cup\{x\}$.  Therefore, by
\S\ssp\ref{order}, the structure $(\varN,\included)$ is a total order,
and (because of Lemma \ref{lem:succ-p-ord}), the map $x\mapsto\vscr x$
is injective; that is, $(\varN,{}\vscr{})$ satisfies \axu.
\end{proof}

\begin{exercise}
  Write a more detailed proof.
\end{exercise}

If there is one model of the Peano axioms, then there are others.  We
now have a notational distinction: $(\N,{}\scr{},0)$ is an arbitrary model
of the axioms, but $(\varN,{}\vscr{},\emptyset)$ is the specific model
that we have defined.  If we want to be precise, we may refer to the
elements of $\varN$ as the \defn{von Neumann} natural numbers.  In the
rest of these notes, though, natural numbers will always be von Neumann
natural numbers, so we shall have $0=\emptyset$, and $1=\{0\}$, and
so forth.  (Also, in Definition \ref{defn:kplus}, we shall give a new
meaning to the symbol ${}\pl$.)

In one sense, it doesn't matter \emph{which} model of the
Peano axioms we use:

\begin{theorem}\label{thm:isom}
  Every model $(\N,{}\scr{},0)$ of the Peano axioms is uniquely
  \tech{isomorphic}
  to $(\varN,{}\vscr{},\emptyset)$, that is, there is a unique bijection
  $f:\N\to\varN$ such that $f(0)=\emptyset$, and $f(\scr
  x)=\vscr{f(x)}$ for all $x$ in $\N$.
\end{theorem}

\begin{proof}
  By recursion, there is a unique function $f$ on $\N$ such that
  $f(0)=\emptyset$, and $\Forall x f(\scr x)=\vscr{f(x)}$.  Then
  $f(0)\in \varN$, and if $f(x)\in\varN$, then $f(\scr x)\in\varN$; so
  $\varN$ includes the range of $f$.  For the same reason, there is a
  function $g:\varN\to \N$ such that $g(\emptyset)=0$ and $g(\vscr
  x)=\scr{g(x)}$.  By induction, $g\circ f$ is the identity on
  $\N$, and $f\circ g$ is the identity on $\varN$.  So $f$ is a
  bijection from $\N$ to $\varN$.
\end{proof}

Nonetheless, as we have seen, the set of von Neumann natural numbers
has the peculiar property that proper inclusion and containment are the same
relation on it, and this relation is the relation $<$ induced by
the Peano axioms.

Since a natural number is now a set of natural numbers, we must be
careful with functional notation.  Suppose for example that
$f:\varN\to\varN$ is the doubling function, $x\mapsto 2\cdot x$.  Then
$f(4)=8$, but $f\setim{4}=f\setim{\{0,1,2,3\}}=\{0,2,4,6\}$.
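This caution can be checked mechanically with the \texttt{frozenset} representation of the von Neumann numbers (a sketch for illustration; the dictionary \texttt{NUM} below is merely a decoding device):

```python
def successor(a):
    return a | frozenset({a})

def von_neumann(n):
    a = frozenset()
    for _ in range(n):
        a = successor(a)
    return a

# decode von Neumann numerals back to ordinary integers
NUM = {von_neumann(n): n for n in range(20)}

def double(x):
    # the doubling function x -> 2·x, computed on von Neumann numerals
    return von_neumann(2 * NUM[x])

four = von_neumann(4)
assert NUM[double(four)] == 8                   # the VALUE f(4)
image = {NUM[double(k)] for k in four}          # the IMAGE f[4] = f[{0,1,2,3}]
assert image == {0, 2, 4, 6}
```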



\setcounter{equation}{0}
\section{Numbers in ordinary language}
\markright{\sectbegin Numbers in ordinary language}

We shall come to understand the von Neumann natural numbers both as
cardinal and as ordinal numbers.

In ordinary languages like Turkish and English (or Latin and
Greek), there is a 
one-to-one correspondence between the cardinal and the ordinal numbers.

Turkish constructs the ordinal numbers from the cardinals by adding the
suffix \Tur{-(\#)nc\#}, where \Tur{\#} is chosen from the set
$\{\text{\Tur{\i, i, u, \"u}}\}$ according to the rules of vowel harmony.
In English, the regular way to get the ordinals from the cardinals is to
add \Eng{-(e)th}, but there are irregularities.  We have:
\begin{center}
\begin{tabular}{|r|c|c|c|c|c|}\hline
 English cardinal: & one & two & three & four & five \\ \hline
 \Tur{T\"urk\c cesi:} & \Tur{bir} & \Tur{iki} & \Tur{\"u\c c} &
 \Tur{d\"ort} & \Tur{be\c s}\\ \hline
 its numeral: & 1 & 2 & 3 & 4 & 5 \\ \hline\hline
 related ordinal: & first & second & third & fourth & fifth \\ \hline
 its abbreviation: & 1st & 2nd & 3rd & 4th & 5th\\ \hline
 \Tur{T\"urk\c cesi:} & \Tur{birinci} & \Tur{ikinci} & \Tur{\"u\c c\"unc\"u} &
 \Tur{d\"ord\"unc\" u} & \Tur{be\c sinci}\\ \hline
 \Tur{k\i saltmas\i}: & 1. & 2. & 3. & 4. & 5. \\ \hline
\end{tabular}
\end{center}

Also, for example, from \Eng{twenty-one} (21) we get \Eng{twenty-first}
(21st), although, historically, the cardinal has been written \Eng{one
and twenty}, with corresponding ordinal \Eng{one-and-twentieth}.

There is evidently no formal connection  between \Eng{one} and
\Eng{first}, or between \Eng{two} and \Eng{second}.  At its roots,
\Eng{first} means \Eng{foremost}, that is, `coming before everything
else.'  Indeed, the \Eng{fir-} in \Eng{first} is related to \Eng{fore}
(as in \Eng{before}), and the \Eng{-st} of \Eng{first} and \Eng{most} is
related to the suffix \Eng{-est} used to form regular superlatives like
\Eng{biggest} and \Eng{soonest}.  Also, \Eng{second} comes from the Latin
\Lat{secvndvs}, meaning `following'.

Thus, it would not do violence to English if we treated zero as
the \emph{first} natural number, and one as the second.  But two as
the \emph{third} number might be strange.  In any case, the word
\Eng{zeroth} (or \Tur{s\i f\i r\i nc\i}) has been coined as a label
for the position of zero on the list of numbers.

\setcounter{equation}{0}
\section{Natural numbers as cardinals}
\markright{\sectbegin Natural numbers as cardinals}

Cardinal numbers name the sizes of sets.  Each natural number (that
is, von Neumann natural number) is a set of a certain size.  So we can
use a natural number as a cardinal number for itself and
for other sets of the same size.  For such a convention to be most
useful, we should make sure that different natural numbers have
different sizes.  To do this, we must be precise about what we mean by
having the same or different sizes.

\begin{definition}
If $A$ and $B$ are sets, then we write:
\begin{enumerate}
  \item
  $A\injects B$, if there is an injection from $A$ into $B$;
  \item
  $A\equip B$, if there is a bijection from $A$ onto $B$;
  \item
  $A\nequip B$, if there is no bijection between $A$ and $B$;
  \item
  $A\prec B$, if $A\injects B$ and $A\nequip B$.
\end{enumerate}
We say that $A$ and $B$ have the \defn{same size}, or are
\defn{equipollent}\footnote{That is, have `equal power'.}, if $A\equip
B$; otherwise, $A$ and $B$ have 
different sizes.  If $A\prec B$, then 
$B$ is \defn{strictly larger than} $A$.
\end{definition}

\begin{lemma}\label{lem:equipbasic}
  On any set of sets, the relation $\equip$ is an
  equivalence-relation, and is a refinement of $\injects$ (that is,
  $A\equip B\implies A\injects B$).  Also, $\injects$ is reflexive and
  transitive. 
\end{lemma}

\begin{exercise}
  Prove the lemma.
\end{exercise}

Certainly $\injects$ is not anti-symmetric, since
$\{1\}\injects\{0\}$, and $\{0\}\injects\{1\}$, but $\{0\}\neq\{1\}$.  We
\emph{shall} show later (in Theorem \ref{thm:Sch--B}) that $\injects$
is anti-symmetric on $\equip$-classes, that is, 
\begin{equation*}A\injects B\land B\injects A\implies
A\equip B.\end{equation*}  
However, this implication is not obvious.
It \emph{is} obvious that $A\included B\implies A\injects B$.

\begin{lemma}
  Distinct natural numbers have different sizes.
\end{lemma}

\begin{proof}
  By Theorem \ref{thm:ineq}, it is enough to show that
  \begin{equation*}\Forall y (x\equip x+y\to y=0)\end{equation*} 
  for all $x$ in $\varN$.  The claim is true if $x=0$,
  since the only function on $\emptyset$ is the empty function.
  Suppose the claim is true when $x=z$.  Say $f:\vscr z\to\vscr z+y$ is
  a bijection.  Then so is $g:z\to z+y$, where
\begin{equation*}g(x)=
  \begin{cases}
    f(x),&\text{ if }f(x)\neq z+y;\\
f(z),&\text{ if } f(x)=z+y.
  \end{cases}
\end{equation*}
Hence $y=0$ by inductive hypothesis.  So the claim is true when
$x=\vscr z$.
\end{proof}

\begin{theorem}\label{thm:injisless}
  On $\varN$, the relation $\injects$ is the total ordering $\leq$.
\end{theorem}

\begin{exercise}
  Prove the theorem.
\end{exercise}

The following is now justified:

\begin{definition}
  If $A\equip n$ for some $n$ in $\varN$, then we call $n$ the
  \defn{cardinality} of $A$, and we write $\size A=n$; we also call
  $A$ a \defn{finite} set.  The natural numbers are the \defn{finite
  cardinal numbers}.
\end{definition}

Note that $\size n=n$ for all $n$ in $\varN$.  We shall ultimately
come up with a definition of $\size A$ for all sets $A$.  This is not
a trivial matter, since some sets are not finite:


\begin{theorem}\label{thm:fin}
Suppose $A\injects B$.  
  If $B$ is finite, then $A$ is finite.
\end{theorem}

\begin{proof}
  It is enough to show that if $n\in\varN$, and $A\included n$, then
  $A$ is finite.

The claim is trivially true if $n=0$.  Suppose it is true when $n=k$.
Say $A\included \vscr k$.  If $A=\vscr k$, then $A$ is finite by
definition.  If $A\included k$, then $A$ is finite by inductive
hypothesis.  In the remaining case, $k\in A$, but there is $m$ in
$k\setminus A$.  Then $(A\cup\{m\})\setminus\{k\}\included k$, so this
set is finite by inductive hypothesis.  But $A$ and
$(A\cup\{m\})\setminus\{k\}$ are equipollent.
\end{proof}

\begin{theorem}
  A set $A$ is finite if and only if $A\prec\varN$.
\end{theorem}

\begin{proof}
  Suppose $A$ is finite; this means $A\equip n$ for some $n$ in
  $\varN$.  Then $A\prec n+1\injects\varN$ by Theorem
  \ref{thm:injisless}.

Now suppose $A$ is not finite, but $A\injects\varN$.  We may assume
$A\included\varN$.  If $x\in\varN$, then $A\not\included x$, by the
last theorem.  Hence we can define $g:\varN\to A$ by
\begin{equation*}
  g(0)=\min A\land g(\vscr x)=\min(A\setminus \vscr{g(x)}).
\end{equation*}
Then $g(x)<g(\vscr x)$ for all $x$ in $\varN$, so $g$ is injective
(why?).  Therefore $\varN\injects A$, so $A\equip\varN$.
\end{proof}

\begin{exercise}
  Supply the missing detail in the proof.
\end{exercise}
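The function $g$ in the last proof simply lists the elements of $A$ in increasing order.  For subsets of $\varN$ given by a decidable membership test, the recursion can be carried out directly (a sketch; the choice of the set of squares is arbitrary, and the minimum is found by scanning upward):

```python
from itertools import count, islice

def g(A_contains, n):
    # g(0) = min A, and g(x + 1) is the least element of A above g(x),
    # for A given by a membership test that holds infinitely often
    members = (m for m in count() if A_contains(m))
    return next(islice(members, n, None))

is_square = lambda m: round(m ** 0.5) ** 2 == m
values = [g(is_square, n) for n in range(5)]
assert values == [0, 1, 4, 9, 16]
# g is strictly increasing, hence injective
assert all(a < b for a, b in zip(values, values[1:]))
```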

The following is immediate:

\begin{corollary}
  $\varN$ is not finite.
\end{corollary}

\begin{theorem}
  The union of two finite sets is finite.  In fact, if $A$ and $B$ are
  finite, then 
\begin{equation*}
\size{A\cup B}+\size{A\cap B}=\size A+\size B.
\end{equation*}
\end{theorem}

\begin{exercise}
Prove the theorem.
\end{exercise}

\setcounter{equation}{0}
\section{Infinite sets}\label{sect:infinite}
\markright{\sectbegin Infinite sets}

Commonly, an infinite set is simply a non-finite set---a set that is
not finite.  However, another definition is preferable, for
reasons to be mentioned presently.

\begin{definition}
  A set is \defn{infinite} if it is equipollent with a proper subset
  of itself.
\end{definition}

\begin{theorem}\label{thm:inf}
Suppose $A\injects B$.
  If $A$ is infinite, then $B$ is infinite.
\end{theorem}

\begin{proof}
  If  $f:A\to B$ and $g:A\to A$ are injections, then so is $h:B\to B$,
  where
\begin{equation*}h(x)=f(g(y)),\end{equation*}
if $x=f(y)$ for some $y$ in $A$, and otherwise $h(x)=x$.  If $g$ is
not surjective, then neither is $h$.
\end{proof}
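The construction of $h$ in the proof can be illustrated with $A$ the natural numbers included in $B$ the integers, taking $f$ to be the inclusion and $g(n)=n+1$ (hypothetical choices, made only for this sketch):

```python
def make_h(f, f_inv, g):
    # h(x) = f(g(y)) if x = f(y) for some y in A; otherwise h(x) = x
    def h(x):
        y = f_inv(x)               # None when x is outside the range of f
        return x if y is None else f(g(y))
    return h

# A = naturals inside B = integers: f is inclusion, g(n) = n + 1
f = lambda n: n
f_inv = lambda x: x if x >= 0 else None
g = lambda n: n + 1

h = make_h(f, f_inv, g)
sample = list(range(-10, 11))
values = [h(x) for x in sample]
assert len(set(values)) == len(values)              # h is injective here
assert 0 not in [h(x) for x in range(-100, 101)]    # h misses f(0) = 0
```

Since $g$ misses $0$ in $A$, the induced $h$ misses $f(0)=0$ in $B$, exhibiting $B$ as equipollent with a proper subset of itself.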

Is Theorem \ref{thm:inf} the contrapositive of Theorem \ref{thm:fin}?
Or is there a set that is neither finite nor infinite, or that is both
finite and infinite?

\begin{lemma}\label{lem:inf-model}
  A set $A$ is infinite if and only if it can be equipped with a unary
  operation $s$ such that, for some $a$ in $A$, the structure
  $(A,s,a)$ is a model of \axz\ and \axu.
\end{lemma}

\begin{exercise}
  Prove the lemma.
\end{exercise}

We immediately have:

\begin{theorem}
  $\varN$ is infinite.
\end{theorem}

\begin{lemma}\label{lem:subst}
  Any model $(A,s,a)$ of \axz\ and \axu\ has a substructure that is a
  model of all of the Peano axioms.
\end{lemma}

\begin{exercise}
  Prove the lemma.
\end{exercise}

By the last lemma and Theorem \ref{thm:isom}, we have:

\begin{theorem}\label{thm:smallest}
A set $A$ is infinite if and only if $\varN\injects A$.
\end{theorem}

\begin{corollary}
No set is both finite and infinite.
\end{corollary}

\begin{exercise}
  Prove the theorem and corollary.
\end{exercise}


\begin{remark}
Lemmas \ref{lem:inf-model} and \ref{lem:subst} justify the name of the
Axiom of Infinity.  By this axiom, the infinite set $\varN$ exists.
But if \emph{any} infinite set exists, then a
model of the Peano axioms exists; hence the specific model
$(\varN,{}\vscr{},\emptyset)$ exists, as shown at the beginning of
\S\ssp\ref{model}.  So, as we stated it, the Axiom of Infinity is
equivalent to the assumption that some infinite set exists. 
\end{remark}

We can show that a set $A$ is infinite if we can find an injective
function $G:\varN\to 
A$.  That $G$ is injective means precisely that
\begin{equation*}G(x)\in A\setminus\{G(0),\dots,G(x-1)\}\end{equation*}
for all $x$ in $\varN$.
Now, if $A$ is \emph{not} finite, then for each $x$ in $\varN$, the set
\begin{equation*}A\setminus\{G(0),\dots,G(x-1)\}\end{equation*} 
is not empty, so there is some hope
that the function $G$ exists.

\begin{definition}
  A \defn{choice-function} for a set $A$ is a function $f:\pow A\to A$
  such that
  \begin{equation*}
    f(X)\in X
  \end{equation*}
for all $X$ in $\pow A\setminus\{\emptyset\}$.
\end{definition}

\begin{theorem}
  If $A$ has a choice-function, then $A\prec\varN\lor\varN\injects A$. 
\end{theorem}

\begin{proof}
  Suppose $A$ has a choice-function $f$, but $A$ is not finite.  Let
  $h$ be $X\mapsto f(A\setminus X):\pow A\to A$.  By strong recursion,
  there is $G:\varN\to A$ given by
  \begin{equation*}
    G(x)=h(G\setim{\pis x}).
  \end{equation*}
Then $G(x)\notin G\setim{\pis x}$, so $G$ is injective.
\end{proof}

By the theorem, all sets are finite or infinite, provided we assume:

\begin{axiom}[Choice]\label{ax:choice}
Every set has a choice-function.
\end{axiom}

The Axiom of Choice is $\ac$ for short.  There are a number of
equivalent formulations, such as \tech{Zorn's Lemma}.
It is a remarkable result of twentieth-century mathematics
  that neither $\ac$ nor its negation is a consequence of our
  earlier axioms.   

\begin{exercise}
  If all sets are finite or infinite, do you think the Axiom of Choice
  follows? 
\end{exercise}

I shall try to be explicit about when I
use $\ac$.  For example:

\begin{theorem}\label{thm:firstac}
  Every set is either finite or infinite (assuming $\ac$).
\end{theorem}

Using Theorem \ref{thm:smallest}, we can prove:

\begin{theorem}\label{thm:succ-equip}
  If $A$ is infinite, then $\vscr A\equip A$.
\end{theorem}

\begin{proof}
  Let $f:\varN\to A$ be an injection.  Define
  $g:\vscr A\to A$ given by:
\begin{equation*}g(x)=
  \begin{cases}
    f(0),& \text{ if }x=A;\\
x,&\text{ if }x\in A\setminus f\setim{\varN};\\
f(f\inv(x)+1),&\text{ if }x\in f\setim{\varN}.
  \end{cases}\end{equation*}
Then $g$ is a bijection.
\end{proof}

Is the converse of this theorem true?  It is, \emph{if} a set is
always a proper subset of its successor.  Suppose if possible that
$A=\{A\}$.  Then $\size A=1$,  but $A=\vscr A$.  By the following,
such sets do not exist:

\begin{axiom}[Foundation]\label{ax:foundation}
Every non-empty set $A$ contains a set $X$ such that $A\cap
X=\emptyset$.
\end{axiom} 

\begin{theorem}
  If $\vscr A\equip A$, then $A$ is infinite.
\end{theorem}

\begin{proof}
  By the Foundation Axiom applied to $\{A\}$, we have $\{A\}\cap
  A=\emptyset$, so $A\notin A$, which means $A\pincluded\vscr A$.
  Since $\vscr A\equip A$, the set $A$ is equipollent with a proper
  subset of itself, hence infinite.
\end{proof}

\setcounter{equation}{0}
\section{The ordering of cardinalities}
\markright{\sectbegin The ordering of cardinalities}

Before defining $\size A$ for sets $A$ in general, we can still write
\begin{equation}\label{eqn:card}
  \size A=\size B
\end{equation}
instead of $A\equip B$.
We can think of $\size A$ as \emph{something}, if only an
$\equip$-class of sets; so in (\ref{eqn:card}) we can call $\size A$
the \tech{cardinality} of $A$.  By Lemma \ref{lem:equipbasic}, the
relation $\injects$ induces a relation on cardinalities, so we can
write
\begin{equation*}\size A\leq\size B\end{equation*}
instead of $A\injects B$.

We haven't yet proved that
$\injects$ is a partial ordering of the cardinalities.  This we now
do.

The appropriate name of the following is uncertain:

\begin{theorem}[Schr\"oder--Bernstein]\label{thm:Sch--B}
  $A\injects B\land B\injects A\implies A\equip B$ for all sets $A$
  and $B$.
\end{theorem}

\begin{proof}
  Suppose $f:A\to B$ and $g:B\to A$ are injections.  We recursively
  define a function
\begin{equation*}n\mapsto(A_n,B_n):\varN\to\pow A\times\pow B\end{equation*}
by requiring $(A_0,B_0)=(A,B)$, and
$(A_{n+1},B_{n+1})=(g\setim{B_n},f\setim{A_n})$.
Since $f$ and $g$ are injective, we have
\begin{equation*}f\setim{(A_n\setminus A_{n+1})}= 
f\setim{A_n}\setminus f\setim{A_{n+1}}= B_{n+1}\setminus B_{n+2},\end{equation*}
and likewise $g\setim{(B_n\setminus B_{n+1})}=A_{n+1}\setminus
A_{n+2}$.  Also
\begin{equation*}f\setim{\bigcap\{A_n:n\in\varN\}}=\bigcap\{B_{n+1}:n\in\varN\}.\end{equation*}
Now define $h:A\to B$ by
\begin{equation*}h(x)=
\begin{cases}
  f(x),& \text{ if $x\in A_{2n}\setminus A_{2n+1}$};\\
g\inv(x),& \text{ if $x\in A_{2n+1}\setminus A_{2n+2}$};\\
f(x),& \text{ if $x\in\bigcap\{A_n:n\in\varN\}$}.
\end{cases}\end{equation*}
Then $h$ is a bijection.
\end{proof}
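The case-division in the proof can also be computed by tracing the ``ancestry'' of each element of $A$ back through $g$ and $f$ alternately until the chain leaves the range of one of them: elements whose chains stop in $A$ (or never stop) lie in some $A_{2n}\setminus A_{2n+1}$ or in the intersection and get $f$; chains stopping in $B$ get $g\inv$.  A Python sketch with the toy injections $f(n)=2n$ and $g(n)=2n+1$ on the natural numbers (chosen for illustration; with these $f$ and $g$ every chain terminates, so the infinite-chain case never arises):

```python
def sb_bijection(f, g, f_inv, g_inv):
    # builds the bijection h from injections f: A -> B and g: B -> A;
    # the partial inverses return None off the ranges of f and g
    def h(x):
        cur, in_A = x, True
        while True:                        # trace the ancestry of x
            pre = g_inv(cur) if in_A else f_inv(cur)
            if pre is None:                # the chain stops on this side
                return f(x) if in_A else g_inv(x)
            cur, in_A = pre, not in_A
    return h

# toy injections on the natural numbers: f(n) = 2n, g(n) = 2n + 1
f = lambda n: 2 * n
g = lambda n: 2 * n + 1
f_inv = lambda n: n // 2 if n % 2 == 0 else None
g_inv = lambda n: (n - 1) // 2 if n % 2 == 1 else None

h = sb_bijection(f, g, f_inv, g_inv)
values = [h(x) for x in range(500)]
assert len(set(values)) == 500               # injective on a sample
assert set(range(100)) <= set(values)        # covers an initial segment
```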

So $\injects$ is anti-symmetric on cardinalities, that is, 
\begin{equation*}
\size A\leq\size B\land\size B\leq\size A\implies\size A=\size B.
\end{equation*}
By Lemma \ref{lem:equipbasic} then, $\injects$ induces a partial
ordering of cardinalities.

From the preceding sections, we know that if $A$ is finite, and $B$ is
infinite, then $\vscr A$ is finite, and
\begin{equation*}A\prec\vscr A\prec\varN\injects B.\end{equation*}
So $\size{\varN}$ is the least infinite cardinality, and is the least
upper bound for the finite cardinalities.  

Does $\injects$ induce a total ordering of cardinalities?  We shall
ultimately (with Theorems \ref{thm:on-wo} and \ref{thm:ch-equip}) show
that it does, by $\ac$.   
First we shall show how to produce, from given sets, strictly larger
sets.  Taking Cartesian products does not generally accomplish this:

\begin{exercise}
  If $A\injects\varN$ and $B\equip\varN$, show that $A\times
  B\equip\varN$. 
\end{exercise}
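One standard approach to the exercise uses the Cantor pairing function $(x,y)\mapsto(x+y)(x+y+1)/2+y$, a bijection from $\varN\times\varN$ to $\varN$.  A quick numerical check (illustrative only; the exercise of course asks for a proof):

```python
def cantor_pair(x, y):
    # (x, y) -> (x + y)(x + y + 1)/2 + y, pairing N x N with N
    return (x + y) * (x + y + 1) // 2 + y

pairs = [(x, y) for x in range(50) for y in range(50)]
codes = [cantor_pair(x, y) for x, y in pairs]
assert len(set(codes)) == len(codes)          # no two pairs share a code
assert set(range(100)) <= set(codes)          # every small number is a code
```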

\setcounter{equation}{0}
\section{Uncountable sets}
\markright{\sectbegin Uncountable sets}

\begin{definition}\label{defn:set-pow}
  If $A$ and $B$ are sets, then 
  \begin{equation*}
      {}^BA
  \end{equation*}
is the set of functions from $B$ to $A$. 
\end{definition}

\begin{lemma}\label{lem:fun-pow}
  $\pow A\equip{}^A2$ for all sets $A$.
\end{lemma}

\begin{proof}
  The function $f\mapsto\{x\in A:f(x)=1\}:{}^A2\to\pow A$ is a
  bijection. 
\end{proof}

We now have several unary operations on sets: 
\begin{enumerate}
  \item
the constant-map
$A\mapsto\emptyset$, 
\item\label{item:succ-map}
the suc\-ces\-sor-map $A\mapsto\vscr A$,
\item\label{item:power-map}
the power-set operation $A\mapsto\pow A$,
\item
the map $A\mapsto {}^AA$, and
\item
the maps $A\mapsto{}^BA$ and $A\mapsto{}^AB$, where $B$ is a set fixed
in advance.
\end{enumerate}
If the sets are hereditary (as ours are), then we also have
$A\mapsto\bigcup A$, and $A\mapsto \bigcup\bigcup A$, and so forth,
and likewise with $\bigcap$.
By Theorem \ref{thm:succ-equip}, operation \pref{item:succ-map} does
not produce 
bigger sets than it starts with, if it starts with infinite sets.
Operation \pref{item:power-map} does produce bigger sets:

\begin{theorem}\label{thm:powergreater}
If $A$ is any set, then $A\prec\pow A$.
\end{theorem}

\begin{proof}
We have an injection $x\mapsto \{x\}:A\to\pow A$, so $A\injects\pow
A$.  Suppose $f$ is an arbitrary 
injection from $A$ into $\pow A$.  Let $B$ be the subset $\{x\in
A:x\notin f(x)\}$ of $A$.  Then $B$ is not in the range of $f$.  For,
suppose $x\in A$.  If $x\in B$, then $x\notin f(x)$, so $B\neq f(x)$.  If
$x\notin B$, then $x\in f(x)$, so again $B\neq f(x)$.  So there is no
bijection between $A$ and $\pow A$.
\end{proof}
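For a small finite set, the diagonal argument can be verified exhaustively: for \emph{every} function $f$ from $A$ to $\pow A$ (injective or not), the set $B=\{x\in A:x\notin f(x)\}$ avoids the range.  A Python sketch over a three-element set:

```python
from itertools import combinations, product

def powerset(A):
    s = list(A)
    return [frozenset(c) for r in range(len(s) + 1)
            for c in combinations(s, r)]

A = [0, 1, 2]
PA = powerset(A)                  # the 8 subsets of A

# every one of the 8^3 functions f: A -> pow(A) misses the diagonal set B
for images in product(PA, repeat=len(A)):
    f = dict(zip(A, images))
    B = frozenset(x for x in A if x not in f[x])
    assert all(f[x] != B for x in A)
```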

If the natural numbers are precisely the counting-numbers, and if a set
should be called `countable' if its elements can be labelled with the
counting-numbers, then the following definition makes sense.

\begin{definition}\label{defn:countable}
If $A\injects \varN$, then $A$ is called
\defn{countable}.  If $A\equip\varN$, then $\varN$ is the
\defn{cardinality} of $A$, that is, $\size A=\varN$.  If $\varN\prec
A$, then $A$ is called \defn{uncountable}. 
\end{definition}

By Theorem \ref{thm:powergreater}, we know that uncountable
sets exist, at least in principle.

\begin{definition}
The set $\R$ of real numbers is called the \defn{continuum}, and its
cardinality is denoted by $\cntm$; that is, $\size{\R}=\cntm$.
\end{definition}

\begin{theorem}
$\cntm=\size{\pow{\varN}}$; in particular, $\R$ is uncountable.
\end{theorem}

\begin{proof}
There is an injection $f:\R\to[0,1)$ given by
  \begin{equation*}
    f(x)=
    \begin{cases}
      (x-1)/(x-2),& \text{ if }x\leq0;\\
1/(x+2),& \text{ if }0\leq x.
    \end{cases}
  \end{equation*}
So it is enough to show $[0,1)\equip{}^{\varN}2$.

Theorem \ref{thm:binary} gives a map $\cut a\mapsto(n\mapsto
a_n):[0,1)\to{}^{\varN}2$.  In fact, this map is a bijection between
  $[0,1)$ and ${}^{\varN}2\setminus A$, where $A$ is the set of
    functions $\sigma:\varN\to 2$ such that, for some $m$ in $\varN$,
    \begin{equation*}
      m\leq n\implies \sigma(n)=1
    \end{equation*}
for all $n$ in $\varN$ (why?).  Hence $[0,1)\injects{}^{\varN}2$.
But the set $A$ is countable (why?).  Therefore we can define an
injection from ${}^{\varN}2$ into $[0,1)$ (how?).  By the
  Schr\"oder--Bernstein Theorem, we are done. 
\end{proof}

\begin{exercise}
  Supply the missing details in the proof.
\end{exercise}

\begin{exercise}
  Show that $\size{{}^{\varN}\varN}=\cntm$.
\end{exercise}

\begin{exercise}
  Show that $\R\times\R\equip\R$.
\end{exercise}

\setcounter{equation}{0}
\section{Ordinal numbers}\label{sect:ordinals}
\markright{\sectbegin Ordinal numbers}

According to the ordinary use of the term, the \tech{ordinal numbers}
should serve as labels for the items on a list 
in such a way that the label determines the position of the item.  In
these notes, list-items are labelled with symbols like ($*$) and
($\dag$) and ($\ddag$); these \emph{distinguish} list-items, but do
not indicate position.  We
shall not define the word \Eng{list}; but we propose:
\begin{enumerate}
  \item
  that every list be well-ordered;
  \item
  that the assignment of ordinals to the items of a list be
  uniquely determined by the ordering of the items.
\end{enumerate}

We shall define ordinals so that they satisfy these requirements, and
so that the natural numbers are ordinals. 

\begin{definition}
  A class is called \defn{transitive} if it properly includes each of
  its elements.  
\end{definition}

\begin{examples}
Each natural number is a transitive set.  The set of natural numbers is
a transitive set. 
The set
\begin{equation*}\{0,1,\{1\}\},\end{equation*}
that is, $\{\emptyset,\{\emptyset\},\{\{\emptyset\}\}\}$, is
a transitive set, but the relation of containment ($\in$) on this set
is not a transitive relation, since $0\in 1$ and $1\in\{1\}$, but
$0\notin\{1\}$.  Containment is a transitive relation on
the set 
\begin{equation*}\{1,\{1\},\{1,\{1\}\}\},\end{equation*}
but this set is not a transitive set, since the
element $1$ is not a subset.
\end{examples}
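Both examples can be checked mechanically (a sketch using \texttt{frozenset}; the variable names are ad hoc):

```python
E = frozenset()                      # 0 = ∅
ONE = frozenset({E})                 # 1 = {∅}
S1 = frozenset({ONE})                # {1}
P = frozenset({ONE, S1})             # {1, {1}}

def is_transitive_set(A):
    # every element of A is a proper subset of A
    return all(x < A for x in A)

def containment_is_transitive(A):
    # x ∈ y and y ∈ z imply x ∈ z, for x, y, z drawn from A
    return all(x in z
               for y in A
               for x in y if x in A
               for z in A if y in z)

EX1 = frozenset({E, ONE, S1})        # {0, 1, {1}}
EX2 = frozenset({ONE, S1, P})        # {1, {1}, {1, {1}}}

assert is_transitive_set(EX1) and not containment_is_transitive(EX1)
assert containment_is_transitive(EX2) and not is_transitive_set(EX2)
```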

\begin{lemma}\label{lem:trans}
No transitive set contains itself.  Every transitive set includes the
successor of each of its elements.  The successor of every transitive set
is transitive.
\end{lemma}

\begin{proof}
Suppose $A$ is transitive.  If $B\not\pincluded A$, then $B\notin A$, by
the definition of transitivity; since $A\not\pincluded A$, it follows
that $A\notin A$.  If $x\in A$, then
$\{x\}\included A$, but also $x\pincluded A$ by transitivity of $A$, so
that $\vscr x\included A$.  If $y\in \vscr A$, then either $y=A$ or
$y\in A$; in either case, $y\pincluded \vscr A$.  Thus $\vscr A$ is
transitive. 
\end{proof}

\begin{definition}
An \defn{initial segment} of a well-ordered set $(A,\leq)$ is a subset $B$ of
$A$ such that
\begin{equation*}x\leq y\land y\in B\implies x\in B\end{equation*}
for all $x$ and $y$ in $A$.
An initial segment is \defn{proper} if it is not
the whole set. 
\end{definition}

\begin{example}
Every natural number is a proper initial segment of $(\varN, \included)$.
\end{example}

\begin{lemma}\label{pis}
For every proper initial segment $B$ of a well-ordered set $(A,\leq)$,
there is an 
element $x$ of $A$ such that $B=\{y\in A:y<x\}=\pis x$.
\end{lemma}

\begin{proof}
Let $x$ be the least element of $A\setminus B$.
\end{proof}

\begin{definition}
  A transitive set is an \defn{ordinal (number)} if it
  is strictly well-ordered by containment.
\end{definition}

\begin{exercise}
  Show that the ordinals compose a class.
\end{exercise}

\begin{definition}
  The class of ordinals is $\on$.
\end{definition}

\begin{example}
$\varN\included\on$ and $\varN\in\on$.
\end{example}

We shall let lower-case letters from the beginning of the Greek
alphabet---such as $\alpha$, $\beta$, $\gamma$, $\delta$ and
$\zeta$---refer to ordinals.  

\begin{lemma}\label{lem:ord-n-set}
Suppose $\alpha$ is an ordinal, and $x$ is a set.  The following are
equivalent:
\begin{enumerate}
  \item\label{xin}
  $x\in \alpha$
  \item\label{xpis}
  $x$ is a proper initial segment of $\alpha$
  \item\label{xpino}
  $x$ is an ordinal, and $x\pincluded \alpha$
\end{enumerate}
\end{lemma}

\begin{proof}
(\ref{xin})$\implies$(\ref{xpino}):  Suppose $x\in\alpha$.  Then
$x\pincluded\alpha$, by transitivity of $\alpha$.  Hence $x$ is strictly
well-ordered by containment, since $\alpha$ is.  Say $y\in x$. Then
$y\neq x$, since $\in$ is irreflexive on $\alpha$.  Also, if $z\in y$,
then $z\in\alpha$, by transitivity of $\alpha$, so $z\in x$ by
transitivity of $\in$ on $\alpha$.  Thus $y\pincluded x$.  Therefore $x$
is transitive.

(\ref{xpino})$\implies$(\ref{xpis}):  Suppose $x$ is an ordinal, and
$x\pincluded\alpha$.  Say $y\in x$ and $z\in y$.  Then $z\in x$ by
transitivity of $x$.  Hence $x$ is a proper initial segment of $\alpha$.

(\ref{xpis})$\implies$(\ref{xin}):  Suppose $x$ is a proper initial
segment of $\alpha$.  Then
\begin{equation*}
  x=\{z\in\alpha:z\in y\}
\end{equation*}
for some $y$ in $\alpha$, by Lemma \ref{pis}.  But if $z\in y$, then
$z\in\alpha$, by transitivity of $\alpha$.  Hence $x=y$.
\end{proof}

\begin{lemma}\label{lem:cto}
Suppose $\alpha$ and $\beta$ are distinct ordinals such that
$\alpha\notin\beta$.  Then $\beta\in\alpha$.
\end{lemma}

\begin{proof}
Since $\alpha\notin\beta$, we have $\alpha\not\pincluded\beta$ by
Lemma \ref{lem:ord-n-set}.  Since also $\alpha\neq\beta$, there is an element of
$\alpha\setminus\beta$. Let $\gamma$ be the least element of
$\alpha\setminus\beta$.  Then $\gamma\included\beta$, since every
element of $\gamma$ lies in $\alpha$ and precedes $\gamma$, hence lies
in $\beta$.  But $\gamma\notin\beta$, so
$\gamma\not\pincluded\beta$, and therefore $\gamma=\beta$.  Thus
$\beta\in\alpha$.
\end{proof}

\begin{theorem}\label{thm:on-wo}
Every class of ordinals is strictly well-ordered by containment.
\end{theorem}

\begin{proof}
Let $\class C$ be a class of ordinals.  Then containment is transitive
on $\class C$, since every element of $\class C$ is transitive.  So
containment is a strict total ordering of $\class C$, by the last two
lemmas.  If $\alpha\in \class C$, then either $\alpha$ is the least 
element of $\class C$, or $\class C\cap\alpha$ has a least element,
which is the least element of $\class C$.
\end{proof}

\begin{corollary}[Burali-Forti Paradox]
  $\on$ is not a set.
\end{corollary}

\begin{proof}
  The class $\on$ is transitive by Lemma \ref{lem:ord-n-set}.  Suppose
  $A$ is a transitive set of ordinals that is strictly well-ordered by
  containment.  Then $A$ is an ordinal, and $A\in\on\setminus A$.  In
  particular, $A\neq\on$.
\end{proof}

As noted in Remark \ref{rem:well-ordered}, being well-ordered, the
class $\on$ admits strong induction and recursion.  It is also said
that $\on$ admits \defn{transfinite} induction and recursion.


\section{Order-types}\label{sect:order-types}
\markright{\sectbegin Order-types}

We are ready to show that the items in every list can be uniquely
labelled by ordinals.

\begin{definition}
Two totally ordered sets \defn{have the same order-type} if they are
isomorphic, that is, there is an
order-preserving bijection between them.
An \defn{order-type} for a \emph{well-ordered} set is an ordinal that is
isomorphic to it.  That is, the ordinal $\alpha$ is an order-type for a
well-ordered set $(A,\leq)$, provided there is a bijection $f:A\to\alpha$
such that
\begin{equation*}
  x<y\to f(x)\in f(y)
\end{equation*}
for all $x$ and $y$ in $A$.
\end{definition}

\begin{lemma}
No well-ordered set has more than one order-type, or has more than one
isomorphism onto its order-type.
\end{lemma}

\begin{proof}
It is enough to show that every ordinal has exactly one order-type,
namely itself, and that the only isomorphism from an ordinal onto itself
is the identity. Suppose $f:\alpha\to\beta$ is a surjective map of
ordinals which is \emph{not} the identity.  Let $\gamma$ be the least
element of $\alpha$ such that $f(\gamma)\neq\gamma$.  If
$f(\gamma)\in\gamma$, then $f(f(\gamma))=f(\gamma)$, by minimality of
$\gamma$.  If $\gamma\in f(\gamma)$, then (by surjectivity of $f$) there
is $\zeta$ in $\alpha$ such that $\gamma\in\zeta$ and $f(\zeta)=\gamma$.
In either case, $f$ is not an isomorphism.
\end{proof}

\begin{theorem}
Every well-ordered set has exactly one order-type.
\end{theorem}

\begin{proof}
Suppose $(A,\leq)$ is a well-ordered set.  Suppose $x\in A$.  If $\pis
x$ has an order-type $\alpha$, let $f$ be the
isomorphism from $\pis x$ onto $\alpha$; if $y\in\pis x$, then $f(y)$ is
an order-type for $\pis y$.

By uniqueness of order-types, if, for every $y$ in $\pis x$, there is an
order-type $f(y)$ for $\pis y$, then the set $\{f(y):y\in\pis x\}$ is
transitive, so it is an ordinal, by Theorem \ref{thm:on-wo}; hence it is the
order-type of $\pis x$.  By strong induction, every proper initial
segment of $A$ has an order-type; the set of these order-types is the
order-type of $A$.
\end{proof}

\section{Kinds of ordinals}
\markright{\sectbegin Kinds of ordinals}

%%\setcounter{theorem}{-1}

Let us feel free to write
\begin{equation*}
  \alpha<\beta,
\end{equation*}
if $\alpha\in\beta$; and $\alpha\leq\beta$, if $\alpha\included\beta$. 

\begin{theorem}
$\vscr\alpha$ is the least ordinal greater than $\alpha$.
\end{theorem}

\begin{proof}
Note first that $\vscr\alpha$ \emph{is} an ordinal.  Also, $\alpha<\vscr\alpha$,
and if $\alpha<\beta$, then $\vscr\alpha\leq\beta$, by Lemma \ref{lem:trans}.
\end{proof}

\begin{corollary}
Every successor-ordinal has a unique predecessor.
\end{corollary}

\begin{proof}
If $\alpha<\beta$, then $\vscr\alpha\leq\beta<\vscr\beta$.  Hence, by Lemma
\ref{lem:cto}, if $\vscr\alpha=\vscr\beta$, then $\alpha=\beta$.
\end{proof}

\begin{definition}
An
ordinal is \defn{positive} if it is not $0$.
A positive ordinal which is not a successor is called a \defn{limit}
ordinal.
\end{definition}

\begin{theorem}
$\varN$ is the least limit ordinal.
\end{theorem}

\begin{proof}
$\varN$ \emph{is} a limit, since $\varN\neq0$, and if $n<\varN$, then
$\vscr n<\varN$ by definition of $\varN$.  So $\varN$ is the least limit
ordinal by Lemma \ref{lem:cto}.
\end{proof}

We can write
\begin{equation*}
  \vscr\varN=\{0,1,2,\dots;\varN\},
\end{equation*}
where the semicolon (;) indicates that $\varN$ is a limit.

\begin{remark}\label{rem:lim-ord}
  If $\alpha$ is a limit ordinal, then $(\alpha,\vscr{},0)$ is a model
  of \axz\ and \axu.  The next section will show that there are limit
  ordinals strictly larger than $\varN$. 
\end{remark}

\section{Cardinality}
\markright{\sectbegin Cardinality}

Every \emph{well-ordered} set is equipollent with some ordinal, by
\S~\ref{sect:order-types}.  \emph{Every} set has a choice-function,
by $\ac$.

\begin{theorem}\label{thm:ch-equip}
Do not assume $\ac$.
A non-empty set has a choice-function if and only if the set is
equipollent with some ordinal. 
\end{theorem}

\begin{proof}
  Suppose $f:\pow A\to A$ is a choice-function for $A$.  By
 strong recursion, for every ordinal $\alpha$, there is a
  unique function $g_{\alpha}:\vscr\alpha\to A$ such that
$$g_{\alpha}(\beta)=f(A\setminus g_{\alpha}\setim{\beta})$$
for all $\beta$ in $\vscr\alpha$.  By definition of a choice-function,
each $g_{\alpha}$ is either injective or surjective.  If $g_{\alpha}$
is always injective, then the function $\alpha\mapsto
g_{\alpha}(\alpha)$ orders a subset of $A$ with the order-type of
$\on$, which is absurd.  So let $\alpha$ be least such that
$g_{\alpha}$ is surjective.  Then $g_{\alpha}$ is a bijection between
$\alpha$ and $A$.

Conversely, suppose $A\equip\alpha\in\on$.  Then $A$ is well-ordered by
the ordering induced from $\alpha$, and $X\mapsto\min X$ on $\pow
A\setminus\{\emptyset\}$ extends to a choice-function of $A$.
\end{proof}

\begin{exercise}
  Look up and prove other equivalent forms of $\ac$.
\end{exercise}

The following is consistent with Definition \ref{defn:countable}:

\begin{definition}
  The \defn{cardinality} of a set is the least ordinal that is
  equipollent with it.  The cardinality of $A$ is denoted $\size A$.  An
  ordinal is a \defn{cardinal} if it is the cardinality of some set.
\end{definition}
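For example, $\size{\vscr\varN}=\varN$: the function taking $\varN$ to
$0$, and $n$ to $\vscr n$ when $n\in\varN$, is a bijection from
$\vscr\varN$ onto $\varN$; and no smaller ordinal is equipollent with
$\vscr\varN$, since every ordinal less than $\varN$ is finite.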

\begin{theorem}
  Infinite cardinals are limit ordinals.
\end{theorem}

\begin{corollary}
  There are limit ordinals strictly larger than $\varN$ (assuming
  $\ac$).
\end{corollary}

\begin{exercise}
  Prove the theorem and its corollary.
\end{exercise}

\begin{exercise}
  Find an example of a limit ordinal that is not a cardinal.
\end{exercise}

\section{The list of cardinals}\label{sect:aleph}
\markright{\sectbegin The list of cardinals}

%\setcounter{theorem}{-1}

Every finite ordinal is a cardinal, but some infinite ordinals, such as
$\vscr\varN$, are not cardinals.  However, for every cardinal there is a
larger cardinal; so---since cardinals are ordinals---there is a
\emph{least} larger.

\begin{definition}\label{defn:kplus}
If $\kappa$ is a cardinal, then $\kappa\pl$ is the least element of
$\{\alpha\in\vscr{\size{\pow{\kappa}}}:\kappa<\size{\alpha}\}$.  (This
set is non-empty, since it contains $\size{\pow{\kappa}}$ itself, and
$\kappa<\size{\pow{\kappa}}$.)
\end{definition}

The following is an instance of \tech{transfinite recursion}: 

\begin{definition}
The ordinals $\aleph_{\alpha}$ are defined as follows:
\begin{enumerate}
  \item
  $\aleph_0=\varN$,
  \item
  $\aleph_{\vscr{\beta}}=\aleph_{\beta}\pl$,
  \item
  $\aleph_{\delta}=\bigcup_{\gamma<\delta}\aleph_{\gamma}$, if $\delta$ is
  a limit-ordinal.
\end{enumerate}
($\aleph$ is the Hebrew letter \emph{aleph}.)
\end{definition}
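For example, $\aleph_1=\varN\pl$, the least ordinal whose cardinality is
uncountable; and $\aleph_\varN=\bigcup_{n<\varN}\aleph_n$, the least
ordinal that includes every $\aleph_n$.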

\begin{lemma}
The infinite cardinals are precisely the ordinals $\aleph_{\alpha}$, and
\begin{equation*}
  \alpha<\beta\iff\aleph_{\alpha}<\aleph_{\beta}.
\end{equation*}
In particular, the assignment $\alpha\mapsto\aleph_{\alpha}$ is an
order-preserving bijection between $\on$ and the class of infinite
cardinals.
\end{lemma}



\begin{proof}
We first prove that each ordinal $\aleph_\alpha$ is a cardinal, and
\begin{equation*}
  \forall\beta\qsep(\beta<\alpha\to\aleph_\beta<\aleph_\alpha).\tag{$*$}
\end{equation*}
This claim is true when $\alpha=0$.  Suppose it is true when
$\alpha=\gamma$.  Then by definition, $\aleph_{\vscr{\gamma}}$ is the least
ordinal whose cardinality is greater than $\aleph_\gamma$.  In
particular, $\aleph_{\vscr{\gamma}}$ is a cardinal, and
\begin{equation*}
  \aleph_{\gamma}<\aleph_{\vscr{\gamma}}.
\end{equation*}
Hence ($*$) holds when $\alpha=\vscr{\gamma}$.

Now suppose that $\delta$ is a limit ordinal.  Say $\aleph_\alpha$ is a
cardinal, and ($*$) holds, whenever $\alpha<\delta$.  By definition,
$\aleph_\delta$ is the least ordinal that includes each cardinal
$\aleph_\alpha$ such that $\alpha<\delta$.  If an ordinal includes a
cardinal, then its cardinality includes that cardinal; therefore
$\aleph_\delta$ is a cardinal.  Also, if $\beta<\delta$, then
\begin{equation*}
  \aleph_\beta<\aleph_{\vscr{\beta}}\leq\aleph_\delta
\end{equation*}
by inductive hypothesis, since $\beta<\vscr{\beta}<\delta$; so ($*$) holds
when $\alpha=\delta$.

Finally, suppose $\kappa$ is an infinite cardinal.  Let $A$ be the class
\begin{equation*}
  \{\alpha\in\on:\aleph_\alpha<\kappa\}.
\end{equation*}
By the Replacement Axiom, $A$ is a \emph{set}, since it is in
one-to-one correspondence with the subset
\begin{equation*}
  \{\beta<\kappa:\exists\alpha\qsep\aleph_\alpha=\beta\}
\end{equation*}
of $\kappa$.  Hence $A$ is not $\on$.  In particular, there is a least
ordinal $\beta$ such that $\kappa\leq\aleph_\beta$.  Hence
$\aleph_\alpha<\kappa$ when $\alpha<\beta$.  But this property of
$\kappa$ is shared by $\aleph_\beta$, which is the least cardinal with
this property.  Therefore $\kappa=\aleph_\beta$.
\end{proof}

\section{The Continuum Hypothesis}
\markright{\sectbegin The Continuum Hypothesis}

We can now say that $\aleph_1$ is the least or first uncountable
cardinal.  What else can we say about $\aleph_1$?  The \defn{Continuum
Hypothesis} (or $\ch$) is that
\begin{equation*}
  \aleph_1=\cntm.
\end{equation*}
It turns out that, just as $\ac$ is independent of $\zf$, so $\ch$ is
independent of $\zfc$.

Now, calculus can be developed using only $\zfc$.  Therefore calculus
will never answer the question of whether
\begin{equation*}
  \aleph_0<\size{A}<\cntm
\end{equation*}
for some subset $A$ of $\R$.  In a sense, this question has no answer.

In another sense, this question can have whatever answer we like.  We
assume $\ac$ because we can, and because it seems to yield good
mathematics (such as our theorem that all sets are finite or infinite).
In the same way, we could assume $\ch$, or $\lnot\ch$.  Some logicians
are recommending the latter.

Whether $\ch$ is assumed or not, we can make the following definition.

\begin{definition}
The assignment $\alpha\mapsto\beth_\alpha$ of an infinite cardinal to
each ordinal is made as follows.
\begin{enumerate}
  \item
  $\beth_0=\varN$;
  \item
  $\beth_{\vscr{\beta}}=\size{\pow{\beth_\beta}}$;
  \item
  $\beth_\delta=\bigcup_{\gamma<\delta}\beth_\gamma$, if $\delta$ is a
  limit ordinal.
\end{enumerate}
($\beth$ is the Hebrew letter \emph{beth}.)
\end{definition}

The Continuum Hypothesis is that
\begin{equation*}
  \aleph_1=\beth_1;
\end{equation*}
the \defn{Generalized Continuum Hypothesis}, or $\mathbf{GCH}$, is that
$\aleph_\alpha=\beth_\alpha$ for all $\alpha$.



\section{Ordinal arithmetic}
\markright{\sectbegin Ordinal arithmetic}

If we have two lists, we can put one after the other to get a new list.
If we have a list of lists, then we can make one big list by first
listing all items of the first list, then listing all items of the second
list, and so forth.  These ideas make sense for well-ordered sets in
general.

\begin{definition}
Let $(A,<)$ and $(B,<)$ be totally ordered sets.  The
\defn{lexicographic} (or \defn{dictionary-}) \defn{order} on $A\times B$
is given by
\begin{equation*}
  (a,b)<(c,d)\iff b<d\lor(b=d\land a<c).
\end{equation*}
\end{definition}

\begin{example}
Say $A$ is the Arabic alphabet, equipped with its alphabetical order. The
lexicographic order on $A\times A$ gives the order in which all
two-letter words would appear in a dictionary.  (The point of using
Arabic is that it is read from right to left.)
\end{example}

\begin{lemma}
If $(A,<)$ and $(B,<)$ are well-ordered sets, then $A\times B$ is
well-ordered by the lexicographic order.
\end{lemma}

\begin{proof}
It is clear that the lexicographic order on $A\times B$ is total.  Say
$C$ is a non-empty subset of $A\times B$.  Let $b$ be the least element
of
\begin{equation*}
  \{y\in B:\exists x\qsep (x\in A\land (x,y)\in C)\},
\end{equation*}
and let $a$ be the least element of $\{x\in A:(x,b)\in C\}$.  Then
$(a,b)$ is the least element of $C$.
\end{proof}

\begin{definition}
The \defn{ordinal sum} $\alpha+\beta$ (the result of \defn{adding}
$\beta$ to $\alpha$) is the order-type of
$$(\alpha\times\{0\})\cup(\beta\times\{1\}),$$
considered as a subset of $(\alpha\cup\beta)\times 2$ with the
lexicographic order.  The \defn{ordinal product} $\alpha\beta$ (the
result of \defn{multiplying} $\alpha$ by $\beta$) is the order-type of
$\alpha\times\beta$.
\end{definition}

We shall see presently that ordinal addition and multiplication, applied
to natural numbers, agree with the operations defined earlier.
In general though, the ordinal operations are not commutative.

\begin{examples}
$\varN+1=\vscr\varN$, and $\varN+\varN=\varN 2$, but $1+\varN=\varN$
and $2\varN=\varN$.
\end{examples}
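To verify that $1+\varN=\varN$, for example, note that $(0,0)\mapsto0$
and $(n,1)\mapsto\vscr n$ define an isomorphism from
$(1\times\{0\})\cup(\varN\times\{1\})$, with the lexicographic order,
onto $\varN$.  By contrast, in $(\varN\times\{0\})\cup(1\times\{1\})$,
the element $(0,1)$ is greater than every $(n,0)$, so the order-type
here is $\vscr\varN$, that is, $\varN+1$.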

\begin{theorem}
\
\begin{enumerate}
  \item
  $\vscr\alpha=\alpha+1$;
  \item
  $(\alpha+\beta)+\gamma=\alpha+(\beta+\gamma)$;
  \item
  $0+\alpha=\alpha+0=\alpha$;
  \item
  $0\alpha=\alpha 0=0$;
  \item
  $(\alpha\beta)\gamma=\alpha(\beta\gamma)$;
  \item
  $1\alpha=\alpha 1=\alpha$;
  \item
  $\alpha(\beta+\gamma)=\alpha\beta+\alpha\gamma$.
\end{enumerate}
\end{theorem}

\begin{exercise}
  Prove the theorem.
\end{exercise}

\begin{corollary}
Applied to natural numbers, the ordinal operations agree with the
earlier definitions.
\end{corollary}

\begin{proof}
It is enough to note:
\begin{align*}
  \alpha+0 & =\alpha, \\
  \alpha+\vscr\beta &
  =\alpha+(\beta+1)=(\alpha+\beta)+1=\vscr{(\alpha+\beta)},\\  
  \alpha 0 &= 0,\\
  \alpha\vscr\beta &=\alpha(\beta+1)=\alpha\beta+\alpha 1=\alpha\beta+\alpha,
\end{align*}
where the operations are the ordinal ones; so these agree with the
operations defined on natural numbers.
\end{proof}


We can now write the following initial segment of $\on$:
\begin{equation*}
  \{0,1,2,\dots;\varN,\varN+1,\varN+2,\dots;
  \varN2,\varN2+1,\dots;\varN3,\dots;\varN\varN\}.
\end{equation*}
Here the ordinals following the semicolons (;) are limits. Note also that
there are limits between $\varN3$ and $\varN\varN$.
 We can
continue the initial segment of $\on$ by writing $\varN\varN$ as
$\varN^2$, and $\varN^2\varN$ as $\varN^3$, and so on; and then we
can write $\varN^\varN$ for the least ordinal that includes the
ordinals $\varN^n$ with $n$ in $\varN$.  Formally, we have the
following, by transfinite recursion:

\begin{theorem}
For any ordinal $\alpha$, there is a unique function $\beta\mapsto
\alpha^{\beta}$ on $\on$ such that:
\begin{enumerate}
  \item
  $\alpha^0=1$;
  \item
  $\alpha^{\beta+1}=\alpha^\beta\alpha$ for all $\beta$ in $\on$;
  \item
  $\alpha^{\delta}=\{\gamma:
  \exists\beta(\beta\in\delta\land\gamma\in\alpha^{\beta})\}$, for all
  limit ordinals $\delta$.
\end{enumerate}
\end{theorem}
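For example, under this definition, $2^\varN=\bigcup_{n<\varN}2^n=
\varN$, since each $2^n$ is a natural number; in particular, $2^\varN$
is countable.  Ordinal exponentiation should therefore not be confused
with the cardinal exponentiation defined in the next section, under
which $2^{\aleph_0}$ is uncountable.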

\begin{lemma}
Applied to natural numbers, the definition in the theorem agrees with the
definition given in \S~\ref{sect:Peano}.
\end{lemma}

\begin{exercise}
  Prove the lemma.
\end{exercise}

We can now continue our initial segment of $\on$ from where we left off:
\begin{multline*}
  \{\dots;\varN^2,\varN^2+1,\dots;
  \varN^2+\varN,\dots;\varN^22,\dots;\varN^3,\dots;
  \varN^\varN,\dots;\\
  \varN^{\varN2},\dots;\varN^{\varN^2},\dots;\varN^{\varN^\varN},
  \dots;\varN^{\varN^{\varN^\varN}},\dots\}.
\end{multline*}
So we have named a lot of infinite ordinals.  Still, we
have only just begun, since all of them are countable.


\section{Cardinal arithmetic}
\markright{\sectbegin Cardinal arithmetic}

Now let $\kappa$, $\mu$ and $\nu$ be cardinals.  We can refer to
the cardinal $\kappa\pl$ as the \tech{successor} of $\kappa$, but we must
be clear that we mean the successor of $\kappa$ \emph{as a cardinal}:
$\kappa\pl$ is not $\kappa+1$, unless $\kappa$ is finite.

We can define addition and multiplication of cardinals, though again we
must distinguish these from the corresponding operations on ordinals.

\begin{definition}
Cardinal addition and multiplication are defined as follows:
\begin{itemize}
  \item
  $\kappa+\mu=\size{(\kappa\times\{0\})\cup(\mu\times\{1\})}$;
  \item
  $\kappa\mu=\size{\kappa\times\mu}$.
\end{itemize}
\end{definition}
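For example, $\aleph_0+\aleph_0=\aleph_0$, since the set
$(\varN\times\{0\})\cup(\varN\times\{1\})$ is countable; likewise
$\aleph_0\aleph_0=\aleph_0$, since $\varN\times\varN$ is equipollent
with $\varN$.  Note also that the cardinal operations, unlike the
ordinal ones, are commutative, by the symmetry of the defining sets.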

We can also define powers of cardinals with cardinal exponents, but
here the definition diverges further from the ordinal case.  First note
the following consequence of the ordinal definitions:

\begin{theorem}
$\alpha^{\beta+\gamma}=\alpha^\beta\alpha^\gamma$.
\end{theorem}

\begin{proof}
The claim is true when $\gamma=0$.  If it is true when $\gamma=\zeta$,
then
\begin{equation*}
  \alpha^{\beta+(\zeta+1)}= \alpha^{(\beta+\zeta)+1}=
  \alpha^{\beta+\zeta}\alpha=\alpha^\beta\alpha^\zeta\alpha=\alpha^\beta\alpha^{\zeta+1},
\end{equation*}
so the claim is true when $\gamma=\zeta+1$.  Finally, if it is true when
$\gamma<\delta$, and $\delta$ is a limit ordinal, then, since
$\delta=\bigcup\{\gamma:\gamma<\delta\}$, we have
\begin{align*}
  \alpha^{\beta+\delta} & =\alpha^{\bigcup \{\beta+\gamma:\gamma<\delta\}}\\
  &=\bigcup \{\alpha^{\beta+\gamma}:\gamma<\delta\}\\
  &=\bigcup \{\alpha^{\beta}\alpha^\gamma:\gamma<\delta\}\\
  &=\alpha^{\beta}\left(\bigcup \{\alpha^\gamma:\gamma<\delta\}\right)\\
  &=\alpha^\beta\alpha^\delta.
\end{align*}
The claim follows by transfinite induction.
\end{proof}

So that the corresponding theorem will hold in the cardinal case, we
make the following definition.

\begin{definition}
$\kappa^\mu=\size{{}^\mu\kappa}$.  (See Definition \ref{defn:set-pow}.)
\end{definition}

\begin{theorem}
$\kappa^{\mu+\nu}=\kappa^\mu\kappa^\nu$.
\end{theorem}

\begin{proof}
We exhibit a one-to-one correspondence between the set of functions from
$${(\mu\times\{0\})\cup(\nu\times\{1\})}$$
to $\kappa$, and the set ${}^\mu\kappa\times{}^\nu\kappa$.  If $f$ is in
the former, then define $(g,h)$ in the latter by $g(x)=f(x,0)$ and
$h(x)=f(x,1)$.
\end{proof}

Because of Lemma \ref{lem:fun-pow},  we have $\cntm=2^{\aleph_0}$, and
$\beth_{\alpha+1}=2^{\beth_\alpha}$.





\nocite{MR28:2989} 
%\nocite{MR83e:04002}

\markright{}
\bibliographystyle{plain}
\bibliography{../../math/references}
\label{biblio}

 \tableofcontents

\end{document}