\chapter{Data as Bits}
\label{chap:bits}
A computer is a machine for processing and transforming information.
Machines inevitably must operate on things that physically exist, so
in order to process information in a machine, we must \emph{represent}
the information in some physical way. While we stop short of
discussing precisely the physical phenomena that underlie modern
computers, we will look at the notion of \emph{value encodings}---how
mathematical objects can be represented such that they can be
processed by machines.
\section{A Bit}
\label{sec:bit}
\begin{definition}[Bit]
A bit (\emph{binary digit}) is a logical state that can represent two possible values,
which we write as $1$ or $0$.
\end{definition}
Conventionally, and in this text, the two possible values are written
$1$ and $0$, as a reference to their interpretation as numbers, but
this is merely a question of notation. We could equally well have used
\emph{true}/\emph{false}, \emph{a}/\emph{b}, \emph{yes}/\emph{no}, or
any other notation that allows us to distinguish unambiguously between the
two possible values. By convention, we say that a bit is \emph{set}
when $1$ and \emph{unset} when $0$.
Bits are used in information theory as a unit of information. For
computers, bits are convenient because any physical phenomenon that
can be interpreted as having two states can be used to represent a
bit. For example:
\begin{enumerate}
\item High or low voltage in an electrical wire.
\item Absence or presence of a hole in some material.
\item Vertical or horizontal polarity of light.
\item Heads or tails of a coin.
\item Whether a cup is full or empty.
\item Whether a corridor is full of soldier crabs or
not~\cite{gunji2011robust}.
\end{enumerate}
Some of these representations are more practical than others, but all
are ultimately based on the notion of a \emph{bit}. The choice of
which representation is most practical in a given setting is largely
based on how we can construct machinery that manipulates the physical
representation of the bits. Modern computers are overwhelmingly
electronic, and use \emph{transistors} to transform electrical signals
representing bits, but optical representations are also common for
long-distance communications.
\subsection{Bit Vectors}
Single bits rarely occur in isolation. Typically we use a sequence of
bits, called a \emph{bit vector}.
\begin{definition}[Bit vector]
A bit vector $x$ of length $w$ is an ordered sequence of $w$ bits,
which we write as $x=\bitvector{x_{w-1}\ldots{}x_{0}}$.
\end{definition}
Note the convention that bit $x_{0}$ in a bit vector is written at the
\emph{rightmost} position. This resembles the ordering of digits in
mathematical notation.
\begin{definition}[Concatenation of bit vectors]
Two juxtaposed bit vectors are interpreted as concatenation, which is
defined as follows:
\[
\bitvector{x_{n-1}\ldots{}x_{0}} \bitvector{y_{m-1}\ldots{}y_{0}}
= \bitvector{x_{n-1}\ldots{}x_{0}~y_{m-1}\ldots{}y_{0}}
\]
\end{definition}
Bit vectors can be interpreted as encoding various mathematical
objects. We will initially be representing values that look very
similar to bits, and the following may seem unnecessarily long-winded
and ceremonious, but eventually we will look at more complicated
encodings. The goal is to establish a firm distinction between the
\emph{encoding} of some mathematical object (say, a number) as a bit
vector, and the mathematical object itself. Such an encoding is
defined by specifying \emph{conversion functions} between the set of
$w$-bit vectors, which we denote $\mathbb{B}^{w}$, and the
mathematical set we wish to encode, say the natural numbers
$\mathbb{N}$.
\subsection{Bit Vectors as Natural Numbers}
\label{sec:bitnats}
One of the most obvious ways to interpret a bit vector is as a number
in base 2. For example, the bit vector $\bitvector{1001}$ might be
interpreted as an encoding of the number $1001_{2} = 9_{10}$. Note the
subscripts used to denote the \emph{radix} of the literals---we will
include these whenever the radix would otherwise be ambiguous.
However, it is important to note that $\bitvector{1001}$ \emph{is not
the same} as $1001_{2}$! They are mathematical objects in completely
different domains, and it makes no sense to say that they are equal or
unequal. It is merely a quirk of notation that they look similar when
written down, and we shall soon enough see encodings where this is not
the case.
To completely specify the encoding of natural numbers, we must define
how a bit vector of length $w$ is interpreted as a number. Here we do
interpret single bits as numbers, $0$ or $1$, and multiply them with a
\emph{weight}. Specifically, bit number $i$ (counting from right to
left, starting from 0) is multiplied with the weight $2^{i}$, and the
results summed.
\begin{definition}[Bit vector to natural number]
\begin{align*}
\mathrm{Bits2N}(\bitvector{x_{w-1}\cdots x_{0}}) \defeq \sum_{i=0}^{w-1} x_{i} \cdot{} 2^{i}
\end{align*}
\label{def:bits2n}
\end{definition}
\begin{example}[Interpreting $\bitvector{1101}$ as natural number]
\[
\begin{array}{rcl}
\mathrm{Bits2N}(\bitvector{1101}) & = & 1 \cdot 2^{0} + 0 \cdot 2^{1} + 1 \cdot 2^{2} + 1 \cdot 2^{3} \\
& = & 13
\end{array}
\]
\end{example}
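As an informal aside, \cref{def:bits2n} can be transcribed almost
directly into C. The function name and the array representation below
are our own choices for illustration (bit $x_{i}$ is stored at index
$i$); the formal development does not depend on them.
\begin{verbatim}
/* Sketch of Bits2N: decode w bits (x[i] holds bit number i,
   so x[0] is the rightmost bit) into an unsigned number. */
unsigned int bits2n(const int x[], int w) {
    unsigned int sum = 0;
    for (int i = 0; i < w; i++)
        sum += x[i] * (1u << i);   /* add x_i * 2^i */
    return sum;
}
\end{verbatim}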
The conversion of a number to a bit vector is slightly less pleasant,
and is defined by the following recursive function.
\begin{definition}[Natural number to $w$-bit vector]
\begin{align}
\mathrm{N2Bits}_{1}(x) \defeq &
\begin{cases}
\bitvector{0} & \textrm{$x$ is even} \\
\bitvector{1} & \textrm{$x$ is odd}
\end{cases} \\
\mathrm{N2Bits}_{w}(x) \defeq &
\begin{cases}
\bitvector{\mathrm{N2Bits}_{w-1}(\lfloor\frac{x}{2}\rfloor)}\bitvector{0} & \textrm{$x$ is even} \\
\bitvector{\mathrm{N2Bits}_{w-1}(\lfloor\frac{x}{2}\rfloor)}\bitvector{1} & \textrm{$x$ is odd}
\end{cases}
\end{align}
\label{def:n2bits}
\end{definition}
Note that $\mathrm{N2Bits}_{w}(x)$ always produces a bit vector with $w$ bits,
no matter the magnitude of $x$. This is because we tend to work only
with bit vectors of some fixed size. We will return to this in
\cref{sec:words}.
\begin{example}[Encoding $9$ as bit vector]
\begin{align}
\mathrm{N2Bits}_{4}(9) &= \mathrm{N2Bits}_{4}(2\cdot{}4 + 1) \\
&= \mathrm{N2Bits}_{3}(4) \bitvector{1} \\
&= \mathrm{N2Bits}_{2}(2) \bitvector{0} \bitvector{1} \\
&= \mathrm{N2Bits}_{1}(1) \bitvector{0} \bitvector{0} \bitvector{1} \\
&= \bitvector{1} \bitvector{0} \bitvector{0} \bitvector{1} \\
&= \bitvector{1001}
\end{align}
\end{example}
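The recursion peels off the least significant bit first, so an
iterative C sketch is equally direct (again, the name and
representation are illustrative only):
\begin{verbatim}
/* Sketch of N2Bits_w: encode x as w bits, storing bit number i
   in bits[i].  If x >= 2^w, only the lower w bits are kept. */
void n2bits(unsigned int x, int w, int bits[]) {
    for (int i = 0; i < w; i++) {
        bits[i] = x % 2;   /* 1 if x is odd, 0 if x is even */
        x = x / 2;         /* floor(x/2) */
    }
}
\end{verbatim}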
This representation can express only non-negative numbers, and is
therefore called \emph{unsigned}, because there can be no leading
minus sign. In \cref{sec:signed-numbers} we will discuss negative
numbers.
When $x \geq 2^{w}$, then $x$ cannot be encoded exactly with a $w$-bit
vector by \cref{def:n2bits}. Intuitively, $x$ is \emph{truncated} and
only the lower $w$ bits of its ``full'' representation are included.
This is an example of \emph{overflow}, which we will return to in
\cref{sec:overflow}. If we wished, we could also have designated a
single distinct bit pattern to encode all numbers that otherwise have
no encoding, which would let us detect anomalous cases. We will see an
example of such an encoding in \cref{chap:floats}.
Having an encoding of numbers is not terribly useful unless we can
also perform \emph{operations}, such as arithmetic, on numbers
represented in the given encoding. However, before we can do that,
we have to talk about Boolean logic.
\section{Boolean Logic}
\label{sec:boolean-logic}
While numbers are one of the most interesting things we can encode as
bit vectors, we will start out by looking at bits as truth values.
\emph{Truth} as a subject of computation was investigated by the
English mathematician George Boole (1815-1864), whose Boolean logic
far predates the modern notions of computers and bits.
Just as we can define operators on \emph{numbers} (such as addition),
we can define operations on \emph{truth values}, writing $T$ for truth
and $F$ for falsity. For example, \emph{logical-and}, written
$p \land q$, is true if $p$ and $q$ are both true, and otherwise
false, while \emph{logical-or}, written $p \lor q$, is true if either
operand is true. The \emph{exclusive-or} operation $p \lxor q$ is
true if exactly one of $p$ and $q$ is true, and \emph{logical
negation} $\neg p$ is true only if $p$ is false. This is shown in
\cref{fig:truth-tables}.
\begin{figure}
\centering
\begin{displaymath}
\begin{array}{|c c||c|c|c|c|}
p & q & p \land q & p \lor q & p \lxor q & \neg p \\
\hline
T & T & T & T & F & F\\
T & F & F & T & T & F\\
F & T & F & T & T & T\\
F & F & F & F & F & T\\
\end{array}
\end{displaymath}
\caption{Truth table for \emph{and}, \emph{or}, \emph{exclusive-or}, and negation.}
\label{fig:truth-tables}
\end{figure}
Since a binary boolean operator can have two possible results ($T$ or
$F$) for a given combination of operands, and there are four possible
combinations of operands, there are $2^{4}=16$ distinct binary
operators on booleans. Most of these can be written in terms of other
operators. For example, the \emph{negated-and} operator can be defined as
\begin{align}
\mathrm{nand}(p,q) \defeq \neg (p \land q)
\end{align}
Interestingly, it turns out that the nand operation is universal, in
that \emph{any} boolean function can be written using a combination of
nand operations.
\begin{example}[Universality of nand]
\begin{align}
\neg p &= \mathrm{nand}(p,p) \\
p \land q &= \mathrm{nand}(\mathrm{nand}(p,q),\mathrm{nand}(p,q)) \\
p \lor q &= \mathrm{nand}(\mathrm{nand}(p,p),\mathrm{nand}(q,q))
\end{align}
\end{example}
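As an informal illustration, the constructions above can be written as
C functions on bits represented by the integers $0$ and $1$ (the
function names are our own):
\begin{verbatim}
/* Bits are represented as the ints 0 and 1. */
int nand(int p, int q) { return !(p && q); }
int not_(int p)        { return nand(p, p); }
int and_(int p, int q) { return nand(nand(p, q), nand(p, q)); }
int or_(int p, int q)  { return nand(nand(p, p), nand(q, q)); }
\end{verbatim}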
\subsection{Boolean Operations as Bits}
The truth values of Boolean logic are easily encoded as bits. By
treating $1$ as $T$ and $0$ as $F$, we can apply the boolean operators
directly:
\begin{align}
0 \land 1 &= 0 \\
0 \lor 1 &= 1 \\
0 \lxor 1 &= 1 \\
\neg 0 &= 1
\end{align}
In \cref{sec:bit} we remarked that bits can be represented using any
physical phenomenon capable of two distinct states. In order to be
practical for \emph{computation}, we must also be able to easily
implement boolean operations on pairs of bits. As mentioned above,
the nand operation is universal, so if we can show how to implement
it, we can implement any boolean function.
In practice, it turns out that representation of bits as high and low
voltages makes it easy to implement logical operations with
\emph{transistors}. It is relatively straightforward to create a
\emph{gate} that has two input wires and one output wire, where the
voltage of the output wire depends on the voltages of the input---and
these gates can be made extremely small using modern manufacturing
techniques. This is why electronic computers have become the most
popular way of implementing computation.
We will not delve further into how logical operations are physically
implemented. Instead, we will use the logical operators to build ever
more elaborate operations on bit vectors, knowing that as long as we
can express a computation in terms of bit operations, we can
ultimately express it in hardware. On this humble foundation we will
build ever more elaborate data representations.
\subsection{Words}
\label{sec:words}
One important concession to practicality is that we will operate on
bit vectors of fixed lengths. While it is possible to build computers
that operate on bit vectors of arbitrary lengths, it is much more
efficient to build circuits that operate on a fixed number of bits at
a time. Usually these are powers of two---$8,16,32,64$, etc. In the
computer systems nomenclature, a bit vector of some directly
hardware-supported fixed size is called a \emph{word}. Generally,
when we use the term \emph{$w$-bit word}, we mean a bit vector
containing $w$ bits.
\begin{definition}[Word]
A word is a bit vector of some fixed size $w$, on which the computer
can efficiently operate directly.
\end{definition}
A $w$-bit word can express $2^{w}$ distinct bit patterns,
meaning it can represent $2^{w}$ different values. When deciding on a
value representation, deciding how to best exploit this limited range
is important. In \cref{def:bits2n} we decided that a $w$-bit word can
represent integers in the range $[0,2^{w}-1]$, which certainly feels
intuitive, but we could just as well have defined a conversion
function that encoded integers in the range $[b,b+2^{w}-1]$ for some
$b$\footnote{In fact, this encoding, known as \emph{biased numbers},
  will make an appearance in \cref{chap:floats}.}. We will also have
to decide what happens when an operation produces a value that lies
outside the representable range.
As a notational convenience, the bitwise operations are extended to
operate elementwise on words as follows:
\begin{definition}[Bitwise operations on words]
\[
\begin{array}{rcl}
\bitvector{x_{w-1}\cdots x_{0}} \land \bitvector{y_{w-1}\cdots y_{0}}
&\defeq& \bitvector{x_{w-1} \land y_{w-1}\cdots x_{0} \land y_{0}} \\
\bitvector{x_{w-1}\cdots x_{0}} \lor \bitvector{y_{w-1}\cdots y_{0}}
&\defeq& \bitvector{x_{w-1} \lor y_{w-1} \cdots x_{0} \lor y_{0}} \\
\bitvector{x_{w-1}\cdots x_{0}} \lxor \bitvector{y_{w-1}\cdots y_{0}}
&\defeq& \bitvector{x_{w-1} \lxor y_{w-1}\cdots x_{0} \lxor y_{0}} \\
\neg \bitvector{x_{w-1}\cdots x_{0}}
&\defeq& \bitvector{\neg x_{w-1}\cdots \neg x_{0}}
\end{array}
\]
\label{def:wordbitwise}
\end{definition}
This is also the semantics bitwise operations have in programming
languages such as C.
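For example, the following C program (with arbitrarily chosen values)
applies the bitwise operators to words, assuming the common case of a
32-bit \texttt{unsigned int}; each output bit depends only on the
corresponding bits of the operands:
\begin{verbatim}
#include <stdio.h>

int main(void) {
    unsigned int a = 0xC, b = 0xA;   /* ...1100 and ...1010 */
    printf("%x\n", a & b);           /* and: 8  (...1000) */
    printf("%x\n", a | b);           /* or:  e  (...1110) */
    printf("%x\n", a ^ b);           /* xor: 6  (...0110) */
    printf("%x\n", ~a);              /* not: fffffff3 for 32-bit words */
    return 0;
}
\end{verbatim}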
\section{Bit Arithmetic}
\label{sec:bit-arithmetic}
Bit vectors have no inherent meaning. Though they look like binary
numbers when we write them down on paper, and we saw in
\cref{sec:bitnats} how we can encode natural numbers as bit vectors,
they do not innately ``know'' how to perform arithmetic operations
such as addition. If we wish to perform an operation on a
mathematical object represented as a bit vector, we must precisely
specify that operation in terms of bit operations. Our goal is to
specify the operation such that the resulting bit vector, when
decoded, corresponds to the result we would have obtained if we
operated directly on the mathematical objects. For a mathematical
operation $\odot$, we will write $\BitOdot$ for the equivalent
operation on numbers encoded as bit vectors.
The algorithms for arithmetic we will introduce are largely similar to
those you were hopefully taught for base-10 numbers as a child. A
large part of learning how to perform binary arithmetic is to
deconstruct what has long since become intuitive and look at the
actual operations we implicitly perform when adding or multiplying
numbers.
\subsection{Addition}
\label{sec:bit-addition}
We wish to define an operation $\BitAdd$ such that
\begin{equation}
\mathrm{Bits2N}(\mathrm{N2Bits}_{w}(x)\BitAdd{}\mathrm{N2Bits}_{w}(y)) = x + y
\end{equation}
Adding binary numbers is much like adding decimal numbers. Starting
from the least significant (rightmost) bits, we add them elementwise,
keeping a carry. Example for adding $x\BitAdd{} y=s$ where
$x=\bitvector{01011}, y=\bitvector{01001}$:
\begin{center}
\begin{tabular}{l|llrr}
$i$ & $x_{i}$ & $y_{i}$ & $s_{i}$ & $c_{i}$ \\\midrule
0 & 1 & 1 & 0 & 1 \\
1 & 1 & 0 & 0 & 1 \\
2 & 0 & 0 & 1 & 0 \\
3 & 1 & 1 & 0 & 1 \\
4 & 0 & 0 & 1 & 0
\end{tabular}
\end{center}
The result is $s=\bitvector{10100}$ with no carry. In terms of bit
operations, we can express the computation of sums and carries as
follows (recall that $\oplus$ means exclusive-or):
\begin{align}
s_{0} &= x_{0} \oplus y_{0} \label{eqn:s0} \\
c_{0} &= x_{0} \land y_{0} \label{eqn:c0} \\
s_{i} &= x_{i} \oplus y_{i} \oplus c_{i-1} \label{eqn:si} \\
c_{i} &= (x_{i} \land y_{i})\lor ((x_{i}\lor y_{i})\land c_{i-1}) \label{eqn:ci}
\end{align}
Thus, the definition of $\BitAdd$ is as follows, when adding two
natural numbers represented as a $w$-bit word.
\begin{definition}[Integer addition]
\begin{align*}
\bitvector{x_{w-1}\cdots{}x_{0}} \BitAdd \bitvector{y_{w-1}\cdots{}y_{0}} \defeq
\bitvector{s_{w-1}\cdots{}s_{0}}
\end{align*}
where $s_{i},c_{i}$ are as in \cref{eqn:s0,eqn:c0,eqn:si,eqn:ci}.
\label{def:intadd}
\end{definition}
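As a sketch, the sum and carry equations translate directly into a C
loop over the bits, using the same illustrative array representation
as earlier; the final carry is returned so that overflow can be
detected:
\begin{verbatim}
/* Sketch of w-bit addition: s[i] and the carries follow the
   equations above; the final carry is returned (1 on overflow). */
int bitadd(const int x[], const int y[], int s[], int w) {
    int c = 0;                                  /* no carry into bit 0 */
    for (int i = 0; i < w; i++) {
        s[i] = x[i] ^ y[i] ^ c;
        c    = (x[i] & y[i]) | ((x[i] | y[i]) & c);
    }
    return c;
}
\end{verbatim}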
Our definition only covers the case where the two operands have the
same number of bits. We can always \emph{zero-extend} a $w$-bit word
to a $(w+l)$-bit word by prepending $l$ zero bits, without changing the
natural number it encodes via $\mathrm{Bits2N}$. On the other hand,
\emph{truncation} by removing bits can change the encoded value.
\subsubsection{Overflow}
\label{sec:overflow}
Our definition of $\BitAdd$ accepts and produces $w$-bit words. What
happens if the resulting number is larger than $2^{w}-1$ and thus
logically requires more than $w$ bits to be represented? Following
\cref{def:intadd}, we see that the final carry bit $c_{w-1}$ is not
part of the result---if this bit is $1$, then we say that the addition
has \emph{overflowed}.
\begin{definition}[Integer overflow]
When the result of an integer operation is so large that it does not
fit in the designated word.
\end{definition}
With the integer representation we have used so far, the result of an
overflow \emph{wraps around} back to zero. This is conceptually
similar to adding $5+5$ in decimal arithmetic and limiting the result
to a single digit.
\begin{example}[$\bitvector{11} \BitAdd \bitvector{10}$]
\begin{align}
s_{0} &= 1 \oplus 0 &= 1 \\
c_{0} &= 1 \land 0 &= 0 \\
s_{1} &= 1 \oplus 1 \oplus 0 &= 0 \\
c_{1} &= (1 \land 1) \lor ((1\lor 0) \land 0) &= 1
\end{align}
This gives us the final sum
\begin{align}
\bitvector{11} \BitAdd \bitvector{10} = \bitvector{01}
\end{align}
and since $c_{1}$ is set, the computation has overflowed.
\end{example}
Some programming languages make the last carry bit available as an
\emph{overflow bit} that programmers can check to see if overflow
occurred. In others, an error is signalled if the overflow bit is set
after an addition. But in many languages, such as C, overflow occurs
silently, and the program may produce an unexpected result when
working with large numbers.
Does this mean that our carefully specified encoding of natural
numbers as a fixed quantity of bits is simply mathematically wrong,
when even something as simple as addition can give us an unexpected
result? Not exactly: while no fixed-size encoding can encompass the
infinite natural numbers, our encoding models the \emph{ring
of natural numbers modulo $2^{w}$}. Our definition of arithmetic is
\emph{modular arithmetic}, which is mathematically quite well behaved.
In particular, arithmetic has the expected algebraic properties
(associativity, commutativity, $0$ as additive identity, etc).
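The wrap-around behaviour is easy to observe in C. In the following
small program we use \texttt{unsigned char}, which is an 8-bit word on
essentially all current platforms:
\begin{verbatim}
#include <stdio.h>

int main(void) {
    unsigned char a = 200, b = 100;
    unsigned char s = a + b;      /* 300 mod 2^8 = 44 */
    printf("%d\n", s);            /* prints 44 */
    return 0;
}
\end{verbatim}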
\subsection{Bit Shifting}
\label{sec:bitshift}
Given a $w$-bit word, we can \emph{shift} the bits of the word left by
$k$ positions, inserting $0$ bits in the newly vacated spots. Put
another way, we discard the leftmost $k$ bits and append $k$ zeroes to
the end.
\begin{definition}[Logical left-shift by $k$ bits]
\begin{align*}
\bitvector{x_{w-1}\cdots{}x_{0}} \BitSll k \defeq
\bitvector{x_{w-1-k}\cdots{}x_{0} \underbrace{0\cdots 0}_{k}}
\end{align*}
\label{def:bits-leftshift}
\end{definition}
Left-shifting by $k$ bits is equivalent to multiplying by $2^{k}$ when
using the encoding of natural numbers from \cref{def:bits2n}. This is
analogous to multiplying a decimal integer by $10$ by appending a $0$.
Note that like addition, this is susceptible to overflow, as we
discard the original $k$ leftmost bits.
\begin{example}[Left-shifting bit vectors with $w=4$]
\begin{align}
\mathrm{N2Bits}_{4}(3) \BitSll 2 &&= \bitvector{0011} \BitSll 2 &&= \bitvector{1100} &&= \mathrm{N2Bits}_{4}(12) \\
\mathrm{N2Bits}_{4}(9) \BitSll 2 &&= \bitvector{1001} \BitSll 2 &&= \bitvector{0100} &&= \mathrm{N2Bits}_{4}(4)
\end{align}
\end{example}
An obvious dual operation is \emph{right-shifting}, where we drop the
$k$ rightmost bits and prepend $k$ zero bits.
\begin{definition}[Logical right-shift by $k$ bits]
\begin{align*}
\bitvector{x_{w-1}\cdots{}x_{0}} \BitSrl k \defeq
\bitvector{\underbrace{0 \cdots 0}_{k} x_{w-1}\cdots{}x_{k}}
\end{align*}
\label{def:bits-rightshift-logical}
\end{definition}
Interpreted as an unsigned number, right-shifting a bit vector by $k$
is equivalent to dividing by $2^{k}$ and then rounding towards zero.
In the C programming language, shifting is provided with the
operators \texttt{>>} and \texttt{<<}, although with an important
quirk that we will discuss in \cref{sec:arithmetic-right-shift}.
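As a small example, on unsigned operands both C shifts behave exactly
as the logical shifts defined above:
\begin{verbatim}
#include <stdio.h>

int main(void) {
    unsigned int x = 9;              /* ...1001 */
    printf("%u\n", x << 2);          /* 36: multiply by 2^2 */
    printf("%u\n", x >> 2);          /* 2:  divide by 2^2, round to zero */
    return 0;
}
\end{verbatim}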
Closely related to shifting is \emph{rotation}, where bits are not
discarded but merely moved to the other end of the word. We will not
use bit rotation in these notes, but it is often efficiently supported
by computers and has some niche uses.
\subsection{Multiplication}
\label{sec:bit-multiplication}
In \cref{sec:bitshift} we saw how left-shifting can be used to
multiply by powers of two. General multiplication is a bit more
involved, but can be done using essentially the same algorithm you
learned in elementary school, only with bits instead of digits. More
efficient algorithms exist, but are much more complicated. The
formula for computing the product $z=x\times{}y$ is
\begin{equation}
z = \sum_{i=0}^{w-1} (x \times y_{i}) \times 2^{i}
\end{equation}
where $y_{i}$ is bit number $i$ of the $w$-bit word $y$. Note:
\begin{itemize}
\item The product $x \times y_{i}$ multiplies a number by a single
  bit, so the result is either $x$ or $0$, which we can compute by
  taking the logical-and of $y_{i}$ with every bit of $x$.
\item The multiplication with $2^{i}$ is with a power of 2, which can
be done as in \cref{def:bits-leftshift}.
\end{itemize}
This lets us express the formula in terms of bit operations.
\begin{definition}[Integer multiplication]
\begin{align*}
\bitvector{x_{w-1}\cdots{}x_{0}} \BitMul \bitvector{y_{w-1}\cdots{}y_{0}} \defeq
\sum_{i=0}^{w-1} \bitvector{x_{w-1} \land y_{i},\ldots,x_{0} \land y_{i}} << i
\end{align*}
where summation is with $\BitAdd$.
\end{definition}
\begin{example}[$5 \times 3$]
\begin{align}
\bitvector{0101} \BitMul \bitvector{0011} &=& \sum_{i=0}^{3} \bitvector{0 \land y_{i},1\land y_{i}, 0\land y_{i}, 1 \land y_{i}} << i \\
&=& \bitvector{0 \land 1,1\land 1, 0\land 1, 1 \land 1} << 0 \\\nonumber
&&\BitAdd \bitvector{0 \land 1,1\land 1, 0\land 1, 1 \land 1} << 1 \\\nonumber
&&\BitAdd \bitvector{0 \land 0,1\land 0, 0\land 0, 1 \land 0} << 2 \\\nonumber
&&\BitAdd \bitvector{0 \land 0,1\land 0, 0\land 0, 1 \land 0} << 3 \\
&=& (\bitvector{0101} << 0) \\\nonumber
&&\BitAdd (\bitvector{0101} << 1) \\\nonumber
&& \BitAdd (\bitvector{0000} << 2) \\\nonumber
&& \BitAdd (\bitvector{0000} << 3) \\
&=& \bitvector{0101} \BitAdd \bitvector{1010} \BitAdd \bitvector{0000} \BitAdd \bitvector{0000} \\
&=& \bitvector{1111}
\end{align}
\end{example}
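The same shift-and-add scheme can be sketched in C on full machine
words; the function name and the choice of $w=32$ are ours, and the
intermediate additions wrap around modulo $2^{32}$ just as $\BitAdd$
does:
\begin{verbatim}
/* Sketch of shift-and-add multiplication on 32-bit words.
   Intermediate sums wrap around modulo 2^32, as in the definition. */
unsigned int bitmul(unsigned int x, unsigned int y) {
    unsigned int z = 0;
    for (int i = 0; i < 32; i++)
        if ((y >> i) & 1u)        /* bit i of y is set, ...        */
            z += x << i;          /* ... so add x * 2^i to the sum */
    return z;
}
\end{verbatim}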
\section{Signed Numbers}
\label{sec:signed-numbers}
The number representation discussed thus far can only represent
non-negative numbers, which in computer science jargon are called
\emph{unsigned}. If we want to handle negative numbers as well, we
need a \emph{signed} representation.
\subsection{Sign-magnitude}
\label{sec:sign-magnitude}
In normal number notation, we turn a number negative by prefixing it
with a minus sign. Thus, an obvious way to introduce negative numbers
is to treat the most significant (leftmost) bit as a \emph{sign bit},
which is set when the number is negative. This representation is
known as \emph{sign-magnitude}. The conversion function is as
follows:
\begin{definition}[Bit vector to integer using sign-magnitude]
\begin{align*}
\mathrm{SM2Int}(\bitvector{x_{w-1}\cdots x_{0}}) \defeq (-1)^{x_{w-1}} \times \mathrm{Bits2N}(\bitvector{x_{w-2}\cdots x_{0}})
\end{align*}
\label{def:sm2int}
\end{definition}
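As an illustration, the decoding can be sketched in C with the word
stored in the low $w$ bits of an \texttt{unsigned int} (the function
name and representation are our own):
\begin{verbatim}
/* Sketch of SM2Int: decode the low w bits of x as sign-magnitude. */
int sm2int(unsigned int x, int w) {
    int magnitude = x & ((1u << (w - 1)) - 1);  /* bits 0 .. w-2 */
    int sign      = (x >> (w - 1)) & 1;         /* the sign bit  */
    return sign ? -magnitude : magnitude;
}
\end{verbatim}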
Arithmetically negating a number in sign-magnitude representation is
quite simple: just negate the sign bit.
\begin{table}
\centering
\[
\begin{array}{c|r||c|r}
x & \mathrm{SM2Int}(x) & x & \mathrm{SM2Int}(x) \\\hline
\bitvector{0000} & 0 & \bitvector{1000} & 0 \\
\bitvector{0001} & 1 & \bitvector{1001} & -1 \\
\bitvector{0010} & 2 & \bitvector{1010} & -2 \\
\bitvector{0011} & 3 & \bitvector{1011} & -3 \\
\bitvector{0100} & 4 & \bitvector{1100} & -4 \\
\bitvector{0101} & 5 & \bitvector{1101} & -5 \\
\bitvector{0110} & 6 & \bitvector{1110} & -6 \\
\bitvector{0111} & 7 & \bitvector{1111} & -7 \\
\end{array}
\]
\caption{All possible four-bit words interpreted as integers using
sign-magnitude representation.}
\label{tab:sign-mag}
\end{table}
Although simple, this representation has the downside that it contains
two representations of zero, as seen in \cref{tab:sign-mag}. While
having multiple representations of the same value is not inherently
wrong, it is slightly wasteful, and complicates the definition of
arithmetic. In particular, arithmetic and comparisons of
sign-magnitude numbers are somewhat involved because we have to treat
the sign bit specially. For these reasons, sign-magnitude is not used
for integers in most modern computers.
\subsection{Two's complement}
\label{sec:twos-complement}
The overwhelmingly most common integer representation in modern
computers is \emph{Two's Complement}. In this representation, a
negative number is encoded by the logically negated bit sequence of
the corresponding unsigned positive number, plus one. This lets us
define an encoding function.
\begin{definition}[Integer to Two's Complement]
\begin{align*}
\mathrm{Z2TC}_{w}(x) \defeq
\begin{cases}
\mathrm{N2Bits}_{w}(x) & x \geq 0 \\
\neg \mathrm{N2Bits}_{w}(|x|) \BitAdd \mathrm{N2Bits}_{w}(1) & x < 0
\end{cases}
\end{align*}
\label{def:z2tc}
\end{definition}
Note that we are using word negation (\cref{def:wordbitwise}) in the
above formula.
An equivalent view of Two's Complement is that it represents integers
by assigning each bit a weight, just like with unsigned numbers, but
assigns the most significant bit (the \emph{sign bit}) a large
\emph{negative} weight. This is the intuition we use in our decoding
function.
\begin{definition}[Two's Complement to Integer]
\begin{align*}
\mathrm{TC2Z}(\bitvector{x_{w-1}\cdots x_{0}}) \defeq -x_{w-1}\cdot 2^{w-1} + \sum_{i=0}^{w-2} x_{i} \cdot{} 2^{i}
\end{align*}
\label{def:tc2Z}
\end{definition}
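A C sketch of this decoding, using the same illustrative
representation as before and assuming $w$ is smaller than the width of
\texttt{int}, reads as follows:
\begin{verbatim}
/* Sketch of TC2Z: decode the low w bits of x as Two's Complement,
   giving the sign bit the weight -2^(w-1).  Assumes w < 32. */
int tc2z(unsigned int x, int w) {
    int value = x & ((1u << (w - 1)) - 1);   /* bits 0 .. w-2 */
    if ((x >> (w - 1)) & 1)                  /* sign bit set? */
        value -= 1 << (w - 1);
    return value;
}
\end{verbatim}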
\begin{table}
\centering
\[
\begin{array}{c|r||c|r}
x & \mathrm{TC2Z}(x) & x & \mathrm{TC2Z}(x) \\\hline
\bitvector{0000} & 0 & \bitvector{1000} & -8 \\
\bitvector{0001} & 1 & \bitvector{1001} & -7 \\
\bitvector{0010} & 2 & \bitvector{1010} & -6 \\
\bitvector{0011} & 3 & \bitvector{1011} & -5 \\
\bitvector{0100} & 4 & \bitvector{1100} & -4 \\
\bitvector{0101} & 5 & \bitvector{1101} & -3 \\
\bitvector{0110} & 6 & \bitvector{1110} & -2 \\
\bitvector{0111} & 7 & \bitvector{1111} & -1 \\
\end{array}
\]
\caption{All possible four-bit words interpreted as integers using
Two's Complement representation.}
\label{tab:twos-complement}
\end{table}
Using Two's Complement, all distinct bit vectors represent distinct
integers, as demonstrated in \cref{tab:twos-complement}. But this
representation also has other useful properties. One almost
miraculous property is that we can add and multiply numbers in Two's
Complement representation the exact same way that we add unsigned
numbers using \cref{def:intadd}, and get the right result (ignoring
overflow). For non-negative numbers, this is not terribly surprising,
as these have identical representations in unsigned and Two's
Complement representation. For negative numbers, the reason this
works is that a negative number $-n$ is encoded by the same bit
pattern as the unsigned number $2^{w}-n$, so unsigned arithmetic
modulo $2^{w}$ produces results that decode correctly. No such
correspondence holds for sign-magnitude. It is likely this property
that has made Two's Complement the dominant integer representation, as
it means a computer designer can use the same circuitry for signed and
unsigned arithmetic. Only when comparing the order of Two's
Complement numbers does a computer need to inspect the sign bit.
\subsubsection{Arithmetic Negation}
Arithmetic negation of Two's Complement numbers is done by logically
negating all the bits and then incrementing by one:
\begin{definition}[Negating a Two's Complement number]
\[
\mathrm{negTC}(\bitvector{x_{w-1}\cdots{}x_{0}}) \defeq \bitvector{\neg{}x_{w-1}\cdots{}\neg{}x_{0}} \BitAdd \bitvector{0\cdots{}1}
\]
\end{definition}
However, because the range of Two's Complement is asymmetric---there
are more negative than positive numbers---negation of the most
negative number is an identity operation.
\begin{example}[Overflow when negating]
\begin{align}
\mathrm{negTC}(\bitvector{1000})
&= \bitvector{\neg 1 \neg 0 \neg 0 \neg 0} \BitAdd \bitvector{0001} \\
&= \bitvector{0 1 1 1} \BitAdd \bitvector{0001} \\
&= \bitvector{1 0 0 0}
\end{align}
\end{example}
Nevertheless, the definition of negation is sufficient for us to
define subtraction of numbers in Two's Complement:
\begin{definition}[Integer subtraction in Two's Complement]
\begin{align*}
x \BitSub y \defeq x \BitAdd \mathrm{negTC}(y)
\end{align*}
\label{def:intsub}
\end{definition}
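On a machine whose words are already interpreted in Two's Complement,
both operations reduce to bitwise negation followed by wrap-around
addition, as this C sketch (with names of our own choosing, for
$w=32$) shows:
\begin{verbatim}
/* Sketch: negation and subtraction via bitwise not and wrap-around
   addition, as in the two definitions above (here w = 32). */
unsigned int neg_tc(unsigned int x)                 { return ~x + 1u; }
unsigned int bitsub(unsigned int x, unsigned int y) { return x + neg_tc(y); }
\end{verbatim}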
\subsubsection{Arithmetic right shift}
\label{sec:arithmetic-right-shift}
In \cref{sec:bitshift} we saw how shifting by $k$ corresponds to
multiplication and division by $2^{k}$. But while left-shifting works
equivalently for unsigned and Two's Complement, right shifting a
negative number tends not to produce the arithmetically correct result.
\begin{example}[Logically right-shifting Two's Complement numbers]
\begin{align}
\mathrm{Z2TC}(4) \BitSrl 1 &&= \bitvector{0100} \BitSrl 1 &&= \bitvector{0010} &&= \mathrm{Z2TC}(2) \\
\mathrm{Z2TC}(-4) \BitSrl 1 &&= \bitvector{1100} \BitSrl 1 &&= \bitvector{0110} &&= \mathrm{Z2TC}(6)
\end{align}
\end{example}
We can consider an unsigned number as consisting of an arbitrary number
of $0$ bits to the left of its most significant bit, in the same
manner we can write decimal numbers with as many leading zeroes as we
desire. When we right-shift, it is these zeroes that are inserted.
Following this analogy, a negative Two's Complement number could be
viewed as having an arbitrary number of $1$ bits to the left of
it---or more generally, copies of the sign bit. When we right-shift,
it is \emph{these} that should be inserted.
To address this issue, we define a new kind of right-shift where we do
not prepend zeroes, but instead copies of the sign bit. This is
called an \emph{arithmetic right shift}.
\begin{definition}[Arithmetic right-shift by $k$ bits]
\begin{align*}
\bitvector{x_{w-1}\cdots{}x_{0}} \BitSra k \defeq
\bitvector{\underbrace{x_{w-1} \cdots x_{w-1}}_{k} x_{w-1}\cdots{}x_{k}}
\end{align*}
\label{def:bits-rightshift-arithmetic}
\end{definition}
\begin{example}[Arithmetically right-shifting negative numbers]
\begin{align}
\mathrm{Z2TC}(4) \BitSra 1 &&= \bitvector{0100} \BitSra 1 &&= \bitvector{0010} &&= \mathrm{Z2TC}(2) \\
\mathrm{Z2TC}(-4) \BitSra 1 &&= \bitvector{1100} \BitSra 1 &&= \bitvector{1110} &&= \mathrm{Z2TC}(-2)
\end{align}
\end{example}
There is no need to define an arithmetic \emph{left} shift, as
left-shifting already behaves as desired. In programming languages
that distinguish between signed and unsigned numbers in their type
system, such as C, right-shifting a signed number typically performs
an arithmetic shift, and right-shifting an unsigned number performs a
logical shift.
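The following C program shows the difference. Note that the result of
right-shifting a negative signed value is, strictly speaking,
implementation-defined in C, but an arithmetic shift is what current
compilers overwhelmingly produce:
\begin{verbatim}
#include <stdio.h>

int main(void) {
    unsigned int u = 0xFFFFFFFCu;  /* bit pattern of -4 (32 bits) */
    int          s = -4;
    printf("%u\n", u >> 1);        /* logical:    2147483646      */
    printf("%d\n", s >> 1);        /* arithmetic: -2 (typically)  */
    return 0;
}
\end{verbatim}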
\subsection{Sign Extension}
\label{sec:sign-extension}
In \cref{sec:bit-addition} we saw that a $w$-bit unsigned number can
be extended to a $(w+k)$-bit number by prepending zeroes. Such
\emph{zero extension} can however change the numeric interpretation of
a Two's Complement number, for reasons similar to the troubles we had
with right-shifting.
\begin{example}[Zero extension may change value]
\begin{align}
\mathrm{TC2Z}(\bitvector{1100}) &= -4 \\
\mathrm{TC2Z}(\bitvector{00001100}) &= 12 \\
\mathrm{TC2Z}(\bitvector{0100}) &= 4 \\
\mathrm{TC2Z}(\bitvector{00000100}) &= 4
\end{align}
\end{example}
To extend a $w$-bit word to a $(w+k)$-bit word while preserving its
Two's Complement numeric value, we perform \emph{sign extension} where
we prepend copies of the sign bit.
\begin{example}[Sign extension preserves value]
\begin{align}
\mathrm{TC2Z}(\bitvector{1100}) &= -4 \\
\mathrm{TC2Z}(\bitvector{11111100}) &= -4 \\
\mathrm{TC2Z}(\bitvector{0100}) &= 4 \\
\mathrm{TC2Z}(\bitvector{00000100}) &= 4
\end{align}
\end{example}
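In C, converting a value to a wider type performs exactly this
distinction automatically: signed types are sign extended and unsigned
types are zero extended. A small illustration:
\begin{verbatim}
#include <stdio.h>
#include <stdint.h>

int main(void) {
    int8_t   s  = -4;             /* bits 11111100          */
    int32_t  sw = s;              /* sign-extended: -4      */
    uint8_t  u  = 0xFC;           /* same bits, unsigned    */
    uint32_t uw = u;              /* zero-extended: 252     */
    printf("%d %u\n", (int)sw, (unsigned)uw);  /* prints: -4 252 */
    return 0;
}
\end{verbatim}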
%%% Local Variables:
%%% mode: latex
%%% TeX-master: "hpps-notes"
%%% End: