by Julian M Bucknall

I came across this remarkable image a couple of days ago. It shows four million digits of π encoded as colored pixels, with one pixel per digit. Each digit has a slightly different color.

The page also allows you to search for digit strings up to six characters long and the page will find them in the image, if, of course, the string is there.

This raises a couple of questions. The first one is this: given that π expressed as a decimal is infinitely long, can you find any digit string in it? (Since π is not a *rational number* – that is, a number that can be expressed as *a*/*b* for two integers *a*, *b* – it has to be infinitely long.) Let’s say you encode “To be or not to be” or anything else as decimal values, maybe using ASCII. Can you find that sequence of digits in π somewhere? It turns out that we don’t know. At least not for sure. A number with that property (that any finite sequence of digits can be found in the number’s infinite decimal representation) is known as a *normal number*. Although mathematicians believe that numbers like π and *e* and √2 are normal, no one has yet proved it.

The next question is this: what is the probability of finding a given digit string in those four million digits at least once? It turns out that answering this is a little bit like explaining the Birthday Paradox (the paradox that in a room of 23 randomly chosen people, the chances of two of them having the same birthday is about 0.5).

Let’s take it slowly. Suppose our string is one digit long, say “7”. What is the probability of finding it at the first position in π? That’s easy: 0.1. At the second position? it’s the same: 0.1. In fact it’s the same throughout the four million digits of π: the chance of finding “7” at *any* position is 0.1. Now suppose our digit string is two digits long, say 42. What’s the probability of finding it at the first position? Well, since there are 100 possible two-digit strings (“00” to “99”), the probability must be 0.01. The second position? The same: 0.01. Ditto for all the other positions in the expansion of π: the probability of finding 42 is 0.01 at any position. I’m sure you can see the pattern and work out the probability of finding a three-digit string (0.001), a four-digit string (0.0001), and so on. In essence, if there are *d* digits, the probability of finding the given string at each and every position is \(0.1^d\).

However, that doesn’t really help us answer the original question: what is the probability of finding a given string at least once? Let’s invert the problem: if we know that the probability of something happening is *p* then the probability of it *not* happening is \(1-p\). So all we need to do is to calculate the probability of *not* finding the given string at all in those four million digits of π, and then our answer is going to be 1 minus that. Brilliant, he says snarkily.

Let’s suppose our given string is “123”. We know that the probability of finding it at position 1 is 0.001. The probability of *not* finding it there is therefore 0.999. The same goes for position 2: the probability of *not* finding it there is 0.999. Ditto for every other position. So the overall probability of all of these positions *not* containing “123” is \( (0.999 \times 0.999 \times 0.999 \times \dots) \) four million times. Or, more concisely, \(0.999^{4,000,000}\). This is equal to 0, at least to the resolution of my scientific calculator. So the probability of this being false, that “123” appears at least once in those four million digits, is 1. Phew.

To generalize, the probability of finding a given *d* digit string in those four million digits of π is going to be this:

\[1-(1-0.1^d)^{4,000,000}\]

For fun, I did a quick set of calculations to show the probabilities of finding strings for various string lengths in those four million digits (OK, Wolfram Alpha did the calculations, but I’m taking the glory). For one to five digits, the probability of finding a given string of that length in four million digits is, for all intents and purposes, 1.0. It’s a certainty, in other words. For six digits, the probability is 0.98; that is, one chance in fifty of not finding a given string. For seven digits, it’s 0.33; two chances in three of not finding it. For eight digits, the probability of finding a given string is 0.04, or one chance in 25. For nine digits it’s one chance in 250.

We now know 10 trillion digits of π. Using the above, I’m sure you can now calculate the probability of finding, say, a given 15-digit string in those 10 trillion digits. (It’s 0.01 or just one chance in 100.)

Loading links to posts on similar topics...

## 1 Response

## #1 Van Swofford said...

15-Jun-12 1:49 PMI've wondered if it would be possible to store almost anything as 2 integers, an offset into pi and a length. If it were, it would be pretty cool to have War and Peace stored as just two integers, and you could reconstitute it with an algorithm. Sounds like that's not likely to be possible, or at least not until some point in the future when pi is known to trillions of digits and the computation power is ubiquitous to do the encoding and decoding. May as well just have a gigantic hard disk to store all that stuff.... :^)

## Leave a response

Note: some MarkDown is allowed, but HTML is not. Expand to show what's available.

`_emphasis_`

`**strong**`

`[text](url)`

``IEnumerable``

`* an item`

`1. an item`

`> Now is the time...`

## Preview of response