1.1 Counting prime numbers

As the reader surely knows, prime numbers are those that have no positive divisors except 1 and the number itself. The special significance of prime numbers is due to the following fact, which we will assume known:

Every positive integer is expressible as a product of primes. The expression is unique if the primes are listed in increasing order.

Effectively, this means that the primes are the basic atoms" in the multipicative system of integers. (If is itself prime, we are regarding it as a product" of one prime, itself.)

The first result on the number of primes was already known to Euclid. Here it is, with Euclid's beautiful proof.

Proposition 1.1.1. There are infinitely many prime numbers.

Proof. Choose finitely many primes . We will show that they cannot constitute the total set of primes. Consider the number

Then is not a multiple of any , since it clearly leaves remainder 1 when divided by . However, by the above statement, is expressible as a product of primes. Let be any one of these. Then is a further prime, different from all the , which therefore indeed fail to constitute the total set of primes.

(Note: This reasoning actually shows a bit more: if the primes are listed as in increasing order, then  .)

With this settled, it is natural to ask how many prime numbers there are up to any given number. This is the topic of our study. Let us give it some notation:

This notation is standard in number theory; there is no real danger of confusion with the number . It will suit our purposes to regard as a function of a real variable . As such a function, it is, of course, constant between primes and jumps by 1 at each prime.

The first impression given by the sequence of primes

is one of extreme irregularity. There are bunches, gaps and relatively uniform stretches. It would appear to be a daunting task to find a simple expression that approximates to for all large enough . The only simple observation is that the primes tend to become more sparse as one goes on. However, an examination of the numerical values of suggests that a reasonable approximation is given by  , a considerably better one by

and a still better one by the logarithmic integral", defined as follows:

Some of these numerical values are as follows (given to the nearest integer).

By the year 1800, long before the age of computers, mathematicians had performed the remarkable feat of calculating these figures by hand up to . In the age of computers, it has of course become much easier to calculate values of . Some readers will be interested in doing so on their own computer: various methods for this are discussed in Appendix F.

Let us formulate precisely the conjecture suggested by these figures. Given two functions , , both tending to infinity as , we write

to mean that

Our conjecture is the statement

In fact, as we will show in section LI,

so at this level it is equivalent to state the conjecture using any of the three functions.

The conjecture is in fact true. The statement is called the prime number theorem. It is indisputably one of the most celebrated theorems in mathematics. Ways of proving it, together with related results, more precise versions and generalizations, form the subject of this book. Of course, numerical evidence of the above type can never constitute a proof of the general statement.

An informal interpretation of the theorem is that the average density" of primes around a large number approximates to , or that the probability" of being prime is (in some sense) .

Let us return to the historical trail. Legendre, in 1798, postulated the approximations and  . He suggested (wrongly) that an even better approximation would be given by , with . Meanwhile, Gauss proposed li. It seems that Gauss recorded his conjecture around 1793 (at the age of 14 !), but did not communicate it to anyone until 1849.

The search for a proof remained one of the main areas of mathematical endeavour during the rest of the 19th century. In 1850, a giant stride was made by Chebyshev, who showed, by essentially number-theoretic methods, that there are constants (not very far from 1) such that

for all large enough . However, no refinement of his methods seemed to offer any hope of proving the desired limit.

A completely different approach was proposed by Riemann in 1859. His starting point was a remarkable identity already discovered by Euler in 1737, expressing the zeta function"

as an infinite product involving the primes. Riemann considered this as a function of a complex variable , defined by the above formula for Re . He showed how to extend the definition of the function to the rest of the complex plane, and outlined a programme showing how, if certain properties of the extended zeta function could be established, the prime number theorem would follow. His paper was a bold imaginitive leap; it was hardly an obvious idea to use the theorems of complex analysis to count prime numbers! However, Riemann was not able to justify all his steps, and one of them, the Riemann hypothesis", has remained unsolved to this day, regarded by many as the most important unsolved problem in mathematics.

It was not until 1896 that Riemann's programme was successfully completed. It was then done so independently by the French mathematician Jacques Hadamard and the Belgian Charles-Jean de la Vallée Poussin. They were able to by-pass the Riemann hypothesis and establish other properties of the zeta function that were sufficient for the purpose. Hadamard lived until 1963 (aged 97) and de la Vallée Poussin until 1962 (aged 95): their mathematical labours cannot have done any harm to their health!

Further variations and modifications of their methods were developed by Mertens, Landau and others, but to this day the simplest, and most powerful, proofs of the prime number theorem rely on the zeta function and complex analysis, as suggested by Riemann. In chapters 1-3, we present a version (and a variant) which benefits from a century of tidying up", but which still recognisably owes its existence to Riemann.

After the successful outcome of Riemann's programme, it remained a matter of great interest to ask whether the theorem could after all be proved by number-theoretic methods, without complex analysis. Some leading mathematicians expressed strong doubts that this could be done. However, it was eventually achieved in 1949, again independently by two people, A. Selberg and P. Erdös. Proofs of this sort are called elementary" as opposed to analytic". However, elementary" does not mean simple"! Half a century later, known proofs of this sort are still more complicated than analytic ones, and less successful in providing error estimates or in delivering other theorems of the same sort. We present a version of the elementary" proof in chapter 6.

Exercises

1. Let and let . For which values of is not a multiple of 2,3 or 5 ? By considering the possible positions of multiples of 7, show that contains at most seven primes (seven cases, no short cuts!).

2. Show that for any , there is eventually a gap of length at least between successive primes. [Hint: consider or .]

3. Let the primes be listed, in order, as . Use Euclid's proof to show by induction that   for each . Deduce that