Oy Veh! Power Series, Analytic Continuations, and Riemann Zeta

After the whole Plait fiasco with the sum of the infinite series of natural numbers, I decided it would interesting to dig into the real math behind that mess. That means digging in to the Riemann function, and the concept of analytic continuation.

A couple of caveats before I start:

1. this is the area of math where I'm at my worst. I am not good at analysis. I'm struggling to understand this stuff well enough to explain it. If I screw up, please let me know in the comments, and I'll do my best to update the main post promptly.
2. This is way more complicated than most of the stuff I write on this blog. Please be patient, and try not to get bogged down. I'm doing my best to take something that requires a whole lot of specialized knowledge, and explain it as simply as I can.

What I'm trying to do here is to get rid of some of the mystery surrounding this kind of thing. When people think about math, they frequently get scared. They say things like "Math is hard, I can't hope to understand it.", or "Math produces weird results that make no sense, and there's no point in my trying to figure out what it means, because if I do, my brain will explode. Only a super-genius geek can hope to understand it!"

That's all rubbish.

Math is complicated, because it covers a whole lot of subjects. To understand the details of a particular branch of math takes a lot of work, because it takes a lot of special domain knowledge. But it's not fundamentally different from many other things.

I'm a professional software engineer. I did my PhD in computer science, specializing in programming languages and compiler design. Designing and building a compiler is hard. To be able to do it well and understand everything that it does takes years of study and work. But anyone should be able to understand the basic concepts of what it does, and what the problems are.

I've got friends who are obsessed with baseball. They talk about ERAs, DIERAs, DRSs, EQAs, PECOTAs, Pythagorean expectations, secondary averages, UZRs... To me, it's a huge pile of gobbledygook. It's complicated, and to understand what any of it means takes some kind of specialized knowledge. For example, I looked up one of the terms I saw in an article by a baseball fan: "Peripheral ERA is the expected earned run average taking into account park-adjusted hits, walks, strikeouts, and home runs allowed. Unlike Voros McCracken's DIPS, hits allowed are included." I have no idea what that means. But it seems like everyone who loves baseball - including people who think that they can't do their own income tax return because they don't understand how to compute percentages - understand that stuff. They care about it, and since it means something in a field that they care about, they learn it. It's not beyond their ability to understand - it just takes some background to be able to make sense of it. Without that background, someone like me feels lost and clueless.

That's the way that math is. When you go to look at a result from complex analysis without knowing what complex analysis is, it looks like terrifyingly complicated nonsensical garbage, like "A meromorphic function is a function on an open subset of the complex number plain which is holomorphic on its domain except at a set of isolated points where it must have a Laurent series".

And it's definitely not easy. But understanding, in a very rough sense, what's going on and what it means is not impossible, even if you're not a mathematician.

Anyway, what the heck is the Riemann zeta function?

It's not easy to give even the simplest answer of that in a meaningful way.

Basically, Riemann Zeta is a function which describes fundamental properties of the prime numbers, and therefore of our entire number system. You can use the Riemann Zeta to prove that there's no largest prime number; you can use it to talk about the expected frequency of prime numbers. It occurs in various forms all over the place, because it's fundamentally tied to the structure of the realm of numbers.

The starting point for defining it is a power series defined over the complex numbers (note that the parameter we use is instead of a more conventional : this is a way of highlighting the fact that this is a function over the complex numbers, not over the reals).

This function is not the Riemann function!

The Riemann function is something called the analytic continuation of . We'll get to that in a moment. Before doing that; why the heck should we care? I said it talks about the structure of numbers and primes, but how?

The zeta function actually has a lot of meaning. It tells us something fundamental about properties of the system of real numbers - in particular, about the properties of prime numbers. Euler proved that Zeta is deeply connected to the prime numbers, using something called Euler's identity. Euler's identity says that for all integer values:

Which is a way of saying that the Riemann function can describe the probability distribution of the prime numbers.

To really understand the Riemann Zeta, you need to know how to do analytic continuation. And to understand that, you need to learn a lot of number theory and a lot of math from the specialized field called complex analysis. But we can describe the basic concept without getting that far into the specialized stuff.

What is an analytical continuation? This is where things get really sticky. Basically, there are places where there's one way of solving a problem which produces a diverging infinite series. When that happens you say there's no solution, that thepoint where you're trying to solve it isn't in the domain of the problem. But if you solve it in a different way, you can find a way of getting a solution that works. You're using an analytic process to extend the domain of the problem, and get a solution at a point where the traditional way of solving it wouldn't work.

A nice way to explain what I mean by that requires taking a
diversion, and looking at a metaphor. What we're talking about here isn't analytical continuation; it's a different way of extending the domain of a function, this time in the realm of the real numbers. But as an example, it illustrates the concept of finding a way to get the value of a function in a place where it doesn't seem to be defined.

In math, we like to play with limits. One example of that is in differential calculus. What we do in differential
calculus is look at continuous curves, and ask: at one specific location on the curve, what's the slope?

If you've got a line, the slope is easy to determine. Take any two points on the line: , where . Then the slope is . It's easy, because for a line, the slope never changes.

If you're looking at a curve more complex than line, then slopes get harder, because they're constantly changing. If you're looking at , and you zoom in and look at it very close to , it looks like the slope is very close to 0. If you look at it close to 1, it looks like it's around 2. If you look at it at x=10, it looks a bit more than 20. But there are no two points where it's exactly the same!

So how can you talk about the slope at a particular point ? By using a limit. You pick a point really close to , and call it . Then an approximate value of the slope at is:

The smaller epsilon gets, the closer your approximation gets. But you can't actually get to , because if you did, that slope equation would have 0 in the denominator, and it wouldn't be defined! But it is defined for all non-zero values of . No matter how small, no matter how close to zero, the slope is defined. But at zero, it's no good: it's undefined.

So we take a limit. As gets smaller and smaller, the slope gets closer and closer to some value. So we say that the slope at the point - at the exact place where the denominator of that fraction becomes zero - is defined as:

(Note: the original version of the previous line had a missing "-". Thanks to commenter Thinkeye for catching it.)

Since is getting closer and closer to zero, is getting smaller much faster; so we can treat it as zero:

So at any point , the slope of is . Even though computing that involves dividing by zero, we've used an analytical method to come up with a meaningful and useful value at . This doesn't mean that you can divide by zero. You cannot conclude that . But for this particular analytical setting, you can come up with a meaningful solution to a problem that involves, in some sense, dividing by zero.

The limit trick in differential calculus is not analytic continuation. But it's got a tiny bit of the flavor.

Moving on: the idea of analytic continuation comes from the field of complex analysis. Complex analysis studies a particular class of functions in the complex number plane. It's not one of the easier branches of mathematics, but it's extremely useful. Complex analytic functions show up all over the place in physics and engineering.

In complex analysis, people focus on a particular group of functions that are called analytic, holomorphic, and meromorphic. (Those three are closely related, but not synonymous.).

A holomorphic function is a function over complex variables, which has one
important property. The property is almost like a kind of abstract smoothness. In the simplest case, suppose that we have a complex equation in a single variable, and the domain of this function is . Then it's holomorphic if, and only if, for every point , the function is complex differentiable in some neighborhood of points around .

(Differentiable means, roughly, that using a trick like the one we did above, we can take the slope (the derivative) around . In the complex number system, "differentiable" is a much stronger condition than it would be in the reals. In the complex realm, if something is differentiable, then it is infinitely differentiable. In other words, given a complex equation, if it's differentiable, that means that I can create a curve describing its slope. That curve, in turn, will also be differentiable, meaning that you can derive an equation for its slope. And that curve will be differentiable. Over and over, forever: the derivative of a differentiable curve in the complex number plane will always be differentiable.)

If you have a differentiable curve in the complex number plane, it's got one really interesting property: it's representable as a power series. (This property is what it means for a function to be called analytic; all holomorphic functions are analytic.) That is, a function is holomorphic for a set if, for all points , you can represent the value of the function as a power series for a disk of values around :

In the simplest case, the constant is 0, and it's just:

(Note: In the original version of this post, I miswrote the basic pattern of a power series, and put both and in the base. Thanks to John Armstrong for catching it.)

The function that we wrote, above, for the base of the zeta function is exactly this kind of power series. Zeta is an analytic function for a particular set of values. Not all values in the complex number plane; just for a specific subset.

If a function is holomorphic, then the strong differentiability of it leads to another property. There's a unique extension to it that expands its domain. The expansion always produces the same value for all points that are within the domain of . It also produces exactly the same differentiability properties. But it's also defined on a larger domain than was. It's essentially what would be if its domain weren't so limited. If is the domain of , then for any given domain, , where , there's exactly one function with domain that's an analytic continuation of .

Computing analytic continuations is not easy. This is heavy enough already, without getting into the details. But the important thing to understand is that if we've got a function with an interesting set of properties, we've got a method that might be able to give us a new function that:

1. Everywhere that was defined, .
2. Everywhere that was differentiable, is also differentiable.
3. Everywhere that could be computed as a sum of an infinite power series, can also be computed as a sum of an infinite power series.
4. is defined in places where and the power series for is not.

So, getting back to the Riemann Zeta function: we don't have a proper closed form equation for zeta. What we have is the power series of the function that zeta is the analytic continuation of:

If , then the series for that function expands to:

The power series is undefined at this point; the base function that we're using, that zeta is the analytic continuation of, is undefined at . The power series is an approximation of the zeta function, which works over some specific range of values. But it's a flawed approximation. It's wrong about what happens at . The approximation says that value at should be a non-converging infinite sum. It's wrong about that. The Riemann zeta function is defined at that point, even though the power series is not. If we use a different method for computing the value of the zeta function at - a method that doesn't produce an incorrect result! - the zeta function has the value at .

Note that this is a very different statement from saying that the sum of that power series is at . We're talking about fundamentally different functions! The Riemann zeta function at does not expand to the power series that we used to approximate it.

In physics, if you're working with some kind of system that's described by a power series, you can come across the power series that produces the sequence that looks like the sum of the natural numbers. If you do, and if you're working in the complex number plane, and you're working in a domain where that power series occurs, what you're actually using isn't really the power series - you're playing with the analytic zeta function, and that power series is a flawed approximation. It works most of the time, but if you use it in the wrong place, where that approximation doesn't work, you'll see the sum of the natural numbers. In that case, you get rid of that sum, and replace it with the correct value of the actual analytic function, not with the incorrect value of applying the power series where it won't work.

Ok, so that warning at the top of the post? Entirely justified. I screwed up a fair bit at the end. The series that defines the value of the zeta function for some values, the series for which the Riemann zeta is the analytical continuation? It's not a power series. It's a series alright, but not a power series, and not the particular kind of series that defines a holomorphic or analytical function.

The underlying point, though, is still the same. That series (not power series, but series) is a partial definition of the Riemann zeta function. It's got a limited domain, where the Riemann zeta's domain doesn't have the same limits. The series definition still doesn't work at . The series is still undefined at . At , the series expands to , which doesn't converge, and which doesn't add up to any finite value, -1/12 or otherwise. That series does not have a value at . No matter what you do, that equation - the definition of that series - does not work at . But the Riemann Zeta function is defined in places where that equation isn't. Riemann Zeta at is defined, and its value is .

Despite my mistake, the important point is still that last sentence. The value of the Riemann zeta function at is not the sum of the set of natural numbers. The equation that produces the sequence doesn't work at . The definition of the Riemann zeta function doesn't say that it should, or that the sum of the natural numbers is . It just says that the first approximation of the Riemann zeta function for some, but not all values, is given by a particular infinite sum. In the places where that sum works, it gives the value of zeta; in places where that sum doesn't work, it doesn't.

Hensel's Lemma: Newton's Method for p-Adic numbers

This is, sadly, the last part of my posts about P-adic arithmetic. It's not for lack of information, but rather for lack of competence on my part. The area of math that I'm worst at is analysis, and an awful lot of the interesting things that you can do with p-adic numbers is analysis.

For an explanation of what p-adic numbers are, and how arithmetic on them works, you can refer to my first post on the p-adics, and for an explanation of p-adic metrics and distance, you can look at this post.

Today, we're going to look at Hensel's lemma. This is really cool. Hensel's lemma shows you a simple way of finding polynomial roots in P-adic arithmetic. It's sometimes called Newton's method for p-adics, which is both a good description of how it's used (but a complete misnomer because the lemma is a description of of a property, and Newton's method is a description of a procedure).

Hensel's lemma says that if you've got a polynomial, and for a prime number p, you can find a root using modulo arithmetic using base p, then you can use the base-p root to find the corresponding root using modulo base p2, and given the root using base p2, you can find it for modulo p3, and so on!

So far, this is just a statement about modulo arithmetic - not about p-adic arithmetic. But what ends up happening is that the result modulo p gives you a first approximation of the root in p-adic. Using the process defined by Hensel's lemma, you can take that root and extend it to a better approximation. It ends up being the case that each time you lift the result to a higher exponent of , you're performing the exactly analog of one step in Newton's method of finding roots! But it's actually better than that. In you look at Hensel's lemma in the p-adic numbers, then each step extends the approximation by exactly one digit!

Let's look at the lemma in formal terms:

Suppose that you have a polynomial .

If there is an integer, and a prime number such that for in mod , such that in mod . In general, if there's an exponent where in mod , then there's a where in mod .

In fact, we can do even better than that! If we know the root for an exponent , then we can easily compute using a process called lifting:

(In this, is the first derivative of at .)

You can see, looking at that equation, that the derivative of at can't be zero mod , or else the denominator of the fraction would be zero, and everything would go kablooie in your face!

In P-adic arithemetic, this becomes absolutely amazing. If is the root mod , then the root mod is:

It can't be simpler. And you can clearly see the relation to Newton's method.

For fun, let's try looking at an example: let's take the square root of 2 in 7-adic. In base-10, the square root of 2 is roughly 1.4142135624. In p-adic, it's going to bo something completely different. We'll start by phrasing it as a polynomial: . So, what's the solution to that mod-7? 3: in mod-7, 3*3=2.

So, our first approximation of a square root of 2 is 3.

We can extend to the next digit using Hensel's lemma, and we get . We're looking at mod 7 arithmetic here, so 6 = -1. So .

Repeated, we'd get 213, 6213, and so on.

Define Distance Differently: the P-adic norm

As usual, sorry for the delay. Most of the time when I write this blog, I'm writing about stuff that I know about. I'm learning about the p-adic numbers as I write these posts, so it's a lot more work for me. But I think many of you will be happy to hear about the other reason for the delay: I've been working on a book based on some of my favorite posts from the history of this blog. It's nearly ready! I'll have more news about getting pre-release versions of it later this week!

But now, back to p-adic numbers!

In the last post, we looked a bit at the definition of the p-adic numbers, and how p-adic arithmetic works. But what makes p-adic numbers really interesting and valuable is metrics.

Metrics are one of those ideas that are simultaneously simple and astonishingly complicated. The basic concept of a metric is straightforward: I've got two numbers, and I want to know how far apart they are. But it becomes complicated because it turns out that there are many different ways of defining metrics. We tend to think of metrics in terms of euclidean geometry and distance - the words that I to describe metrics "how far apart" come from our geometric intuition. In math, though, you can't ever just rely on intuition: you need to be able to define things precisely. And precisely defining a metric is difficult. It's also fascinating: you can create the real numbers from the integers and rationals by defining a metric, and the metric will reveal the gaps between the rationals. Completing the metric - filling in those gaps - gives you the real numbers. Or, in fact, if you fill them in differently, the p-adic numbers.

To define just what a metric is, we need to start with fields and norms. A field is an abstract algebraic structure which describes the behavior of numbers. It's an abstract way of talking about the basic structure of numbers with addition and multiplication operations. I've talked about fields before, most recently when I was debunking the crackpottery of E. E. Escultura here.

A norm is a generalization of the concept of absolute value. If you've got a field , then a norm of is a function, the norm of is a function from values in to non-negative numbers.

1. if and only if .

A norm on can be used to define a distance metric between and in as .

For example, the absolute value is clearly a norm over the real numbers, and it defines the euclidean distance between them.

So where do the gaps come from?

You can define a sequence of values in as
for some set of values . There's a special kind of sequence called
a Cauchy sequence, which is a sequence where .

You can show that any Cauchy sequence converges to a real number. But even if every element of a Cauchy sequence is a rational number, it's pretty easy to show that many (in fact, most!) Cauchy sequences do not converge to rational numbers. There's something in between the rational numbers which Cauchy sequences of rational numbers can converge to, but it's not a rational number. When we talk about the gaps in the rational numbers, that's what we mean. (Yes, I'm hand-waving a bit, but getting into the details
would be a distraction, and this is the basic idea!)

When you're playing with number fields, the fundamental choice that you get is just how to fill in those gaps. If you fill them in using a metric based on a Euclidean norm, you get the real numbers. What makes the p-adic numbers is just a different norm, which defines a different metric.

The idea of the p-adic metric is that there's another way of describing the distance between numbers. We're used to thinking about distance measured like a ruler on a numberline, which is what gives us the reals. For the p-adics, we're going to define distance in a different way, based on the structure of numbers. The way that the p-adic metric works is based on how a number is built relative to the prime-number base.

We define the p-adic metric in terms of the p-adic norm exactly the way that we defined Euclidean distance in terms of the absolute value norm. For the p-adic number, we start off with a norm on the integers, and then generalize it. In the P-adic integers, the norm of a number is based around the largest power of the base that's a factor of that number: for an integer , if is the largest power of that's a factor of , then the the p-adic norm of (written ) is . So the more times you multiply a number by the p-adic base, the smaller the p-adic norm of that number is.

The way we apply that to the rationals is to extend the definition of p-factoring: if is our p-adic base, then we can define the p-adic norm of a rational number as:

• For other rational numbers : where:
• If is a natural number, then is the exponent of the largest power of that divides .
• If is a rational number , then .

Another way of saying that is based on a property of rational numbers and primes. For any prime number , you can take any rational number , and represent it as a p-based ratio , where neither nor is divisible by . That representation is unique - there's only one possible set of values for , , and where that's true. In that case, p-adic norm of ,
.

Ok, that's a nice definition, but what on earth does it mean?

Two p-adic numbers and are close together if is divisible by a large power of .

In effect, this is the exact opposite of what we're used to. In the real numbers written out in decimal for as a series of digits, the metric says that the more digits numbers have in common moving from left to right, the closer together they are. So 9999 and 9998 are closer than 9999 and 9988.

But with P-adic numbers, it doesn't work that way. The P-adic numbers are closer together if, moving right to left, they have a common prefix. The distance ends up looking very strange. In 7-adic, the distance between 56666 and 66666 is smaller than the distance between 66665 and 66666!

As strange as it looks, it does make a peculiar kind of sense. The p-adic distance is measuring a valuable and meaningful kind of distance between numbers - their distance in terms of
their relationship to the base prime number p. That leads to a lot of interesting stuff, much of which is, to be honest, well beyond my comprehension! For example, the Wiles proof of Fermat's last theorem uses properties of the P-adic metric!

Without getting into anything as hairy as FLT, there are still ways of seeing why the p-adic metric is valuable. Next post, we'll look at something called Hensel's lemma, which both shows how something like Newton's method for root-finding works in the p-adic numbers, and also shows some nice algebraic properties of root-finding that aren't nearly as clear for the real numbers.

I've been having a pretty rough month. The day after thanksgiving,
I was diagnosed with shingles. Shingles is painful - very, very painful. Just the friction of clothing brushing over the shingles when I move caused so much pain that I spent weeks being as immobile as possible.

Needless to say, this has been extremely boring. When I wasn't fuzzed out on pain-killers, I've been doing some reading lately on something called p-adic numbers. p-adics are a very strange kind of number: They're strange-looking, strange-acting, and strangely useful.

P-adics are an alternative to the real numbers. In fact, in a way, they are the alternative to real numbers.

If you go back to number theory, there's a reason why the rational numbers aren't sufficient, and why we have to extend them to the reals by adding irrational numbers. If you take all of the possible rational numbers, you find that there are gaps - places where we know that there must be a number, and yet, if we limit ourselves to fractions, there isn't anything to fit. For example, we know that there must be some number where if we multiply it by itself, it's equal to 2. It's more than 1 4/10ths, by about 1/100th. But it's more than 141/100ths, but about 4/1,000ths. It's more that 1,414/1,000s, by about 2/10,000ths. No matter how far we go, there's no fraction that's exactly right. But we know that there's a number there! If we look at those gaps carefully, we'll find that most numbers are actually in those gaps! The real numbers and the p-adics are both ways of creating number systems that allow us to define the numbers that fit in to those gaps.

The easiest way to understand p-adic numbers is think about how we
represent real numbers in base-P. In real numbers, we represent them using the powers of the base. So, for example, we'd write 24.31 in base-5 to mean 2*51 + 4*50 + 3*5-1 + 1*5-2. Using our normal notation for real numbers, we fill in the gaps by writing numbers as an integer part, followed by a decimal point, followed by a fractional part. For real numbers that aren't rationals, we say that the fractional part goes on forever. So the square root of two starts 1.4142135624, and continues on and on, forever, without repeating. That gives us the real numbers. In that system, every number can be written using a finite number of digits to the left of the decimal point, and an infinite number of digits to the right.

P-adic numbers are exactly the opposite: every P-adic number has a finite number of digits to the right of the decimal, but it can have an infinite number to the left!

Defining p-adic numbers starts off being pretty similar to how we compute the representation of numbers in a standard numeric base. Take a number for your base, like 10. For a number n in base 10, take n modulo 10. That's the right-most digit of your number. Divide by ten, dropping the fractional part, and take the result modulo 10, and that's the second-rightmost digit. Continue this until there's nothing left.

For example, take the number 222 in base-10. If we wanted to represent that in base-7, we'd do:

1. If we divide 222 by 7, we get 31 with a remainder of 5. So the rightmost digit is 5.
2. We take 31, and divide it by 7, giving us 4, with a remainder of 3. So the second digit is 3.
3. We're left with 4, so the last digit is 4.

So - 222 in base-7 is 435. It's the same in 7-adic. For a particular base B, all positive integers are written the same in both base-B and B-adic. Integer arithmetic that doesn't involve negative numbers is also the same.

There's one reallybig catch, and it's one that would make a lot of cranks (like E. E. Escultura) very happy: for real numbers, the decimal notation (for example) is a just a representation - 35 in base 10 is written differently from 43 in base 8 but they're the same actual number, just written differently. 35 in 10-adic is not the same number as 43 in 8-adic. As we'll see when we get to metrics, they're quite different. Each p-adic base produces a distinct system of p-adic numbers, and you can't convert between them as freely as you can in the conventional reals. Decimal notation and hexidecimal notation are writing numbers in the same number system; 2-adic and 3-adic are different number systems!

The first essential difference between p-adic numbers and the reals comes when you try to do arithmetic.

As I said above, for integers, if you don't
do anything with negative numbers, P-adic arithmetic is the same as real number integer arithmetic. In fact, if you're working with P-adic numbers, there are no negative numbers at all! In a P-adic number system, subtracting 1 from 0 doesn't give you -1. It "rolls over" like an infinite odometer. So for 7-adic, 0-1 = ...666666666! That means that arithmetic gets a bit different. Actually, it's really pretty easy: you just do arithmetic from the right to the left, exactly the way that you're used to.

For addition and subtraction, P-adic works almost like normal real-number arithmetic using decimals. Addition is basically what you know from decimal arithmetic. Just go right to left, adding digits, and carrying to the left.

So, for example, in 5-adic, if you have a number ...33333, and 24, to add them, you'd go through the following steps.

1. 3 + 4 is 7, which is 12 in base-5. So the first digit of the sum is 2, and we carry 1.
2. 3 + 2 is 6, plus the carried one is 6 - so again, 12 in base-5. So the second digit is also 2, and we carry 1.
3. 3 + 0 is 3, plus the carried one is 4, se the third digit is 4.
4. For all the rest of the infinite digits streaming off to the left, it's 3+0=3.

So the sum is ...3333422.

To do subtraction, it's still basically the same as what you're used to from decimal. There's just one simple change: infinite borrowing. In normal subtraction, you can borrow from the position to your left if there's anything to your left to borrow from. For example, in decimal, if you wanted to subtract 9 from 23, you'd borrow 10 from the 2, then subtract 9 from 13, giving you a result of 14. But if you wanted to substract 3-9, you couldn't borrow, because there's nothing to the left to borrow from. In p-adic, you can always borrow. If you're subtracting 3-9 in 10-adic, then you can borrow from the 0 to the left of the 3. Of course, there's nothing there - so it needs to borrow from its left. And so on - giving you an infinite string of 9s. So 3-9 in 10-adic gives you ....999999994.

Let's do a full example: ...33333 - 42 in 5-adic.

1. As always, we start from the right. 3 - 2 = 1, so the first digit is 1.
2. Since 3 is smaller than 4, we need to borrow 1 - so we have 13 base 5, which is 8. 8 - 4 = 4. So
the second digit is 4.
3. For the third digit, we just subtract the borrowed 1, so it's 2.

So the result is ...3333241.

Multiplication and division get even stranger in P-adic. Because we can't have an infinite number of digits to the right of the decimal, p-adic ends up representing fractions using infinite numbers of digits on the right of the decimal. And that means that we get collisions between fractions and negative numbers. (This should start to give you a clue why each p-adic base is really a different number system: the correspondance between roll-over negatives and infinitely long fractions is different in each base.) It's easiest to see how this works by looking at a concrete example.

The fraction 1/3 can't be written as finite-length string in base-5. In 5-adic, that means we can't write it using digits to the right of the decimal point - we would need an infinite number of digits, and we're not allowed to do that. Instead, we need to write it with an infinite number of digits to the left! 1/3 in 5-adic is: ...1313132.

Looks crazy, but it does work: if you do a tabular multiplication, right to left, multiplying ...1313132 by 3 gives you one! Let's work it through:

• Start from the right: the rightmost digit is 2. 3*2 is 6, which is 11 in base 5; so the rightmost digit is 1, and you carry a one.
• The next digit is 3: 3 times 3 is - 9, plus the carried 1, gives 10 - which is 20 in base-5, so the next digit is a 0, and you carry 2.
• The next digit is 1: 3*1 is 3 plus the carried 2 = 5; so 0, carry 1.

And so on - ....13131313132 * 3 = 1, so ....131313132 == 1/3 in 5-adic.

How can we compute the value of 1/3? Just like decimal real numbers: division.

Division in p-adics is actually easy. The trick is that like all of the other arithmetic, it goes from right to left. Suppose we want to divide N by M. To make it easy, we'll talk about the digits of M and N using subscripts. The rightmost digit of a number X is X1; the second-rightmost is X2, etc. The multiplication algorithm is:

1. Start at the rightmost digit of both numbers.
2. Find the smallest number d which, multiplied by M, has as Ni as its rightmost digit.
3. Subtract d*Mi from N.
4. Drop the trailing last digits from N, giving N'.
5. Now divide N' by M, and put d on the right.

Let's walk through 1/3 in 5-adic:

• The rightmost digit of 1 is 1.
• What, multiplied by 3 will have a trailing digit of 1 in base-5? 2*3=6, which is 11 in base-5. So d = 2.
• Now we subtract the "11" from 1 - which is really ...00001. So it becomes ...444440.
• We drop the trailing 0, and N' is ...4444.
• So now we divide ...4444 by 3. What's the smallest number which, multiplied by 3, ends in a 4 in base-5? 3*3=9, which is 14 in base-5. So the next digit of the result is 3.
• Now, we subtract 14 from ...4444. That gives us ...44430. Drop the zero, and it's ...4443
• Next digit is a 1, leaving ...444

Crazy, huh? There is one catch about division: it only really works if the p-base in your p-adic system is a prime number. Otherwise, you get trouble - your p-adic system of numbers isn't a field if p is non-prime.

If you think about it, arithmetic with the P-adics is actually simpler than it is with conventional real numbers. Everything goes right to left. It's all more uniform. In fact, p-adic has been seriously proposed as a number representation for computer hardware, because the hardware is much easier to build when everything can be done uniformly right to left!

There's a lot more wierdness in the p-adics. We haven't even started to talk about metrics and distances in p-adic numbers. That's both where the p-adics get even stranger, and where they actually get useful. That'll be the next post, hopefully within a day or two!

Models and Why They Matter

As I said in the last post, Church came up with λ-calculus, which looks like it's a great formal model of computation. But - there was a problem. Church struggled to find a model. What's a model, and why would that matter? That's the point of this post. To get a quick sense of what a model is, and why it matters?

A model is basically a mapping from the symbols of a logical system to some set off objects, such that all statements that you can prove in the logical system will be true about the corresponding objects. Note that when I say object here, I don't necessarily mean real-world physical objects - they're just something that we can work with, which is well-defined and consistent.

Why does it matter? Because the whole point of a system like λ-calculus is because we want to use it for reasoning. When you have a logical system like λ-calculus, you've built this system with its rules for a reason - because you want to use it as a tool for understanding something. The model provides you with a way of saying that the conclusions you derive using the system are meaningful. If the model isn't correct, if it contains any kind of inconsistency, then your system is completely meaningless: it can be used to derive anything.

So the search for a model for λ-calculus is really important. If there's a valid model for it, then it's wonderful. If there isn't, then we're just wasting our time looking for one.

So, now, let's take a quick look at a simple model, to see how a problem can creep in. I'm going to build a logic for talking about the natural numbers - that is, integers greater than or equal to zero. Then I'll show you how invalid results can be inferred using it; and finally show you how it fails by using the model.

One quick thing, to make the notation easier to read: I'm going to use a simple notion of types. A type is a set of atoms for which some particular one-parameter predicate is true. For example, if is true, I'll say that x is a member of type P. In a quantifier, I'll say things like to mean . Used this way, we can say that P is a type predicate.

How do we define natural numbers using logic?

First, we need an infinite set of atoms, each of which represents one number. We pick one of them, and call it zero. To represent the fact that they're natural numbers, we define a predicate , which is true if and only if x is one of the atoms that represents a natural number.

Now, we need to start using predicates to define the fundamental properties of numbers. The most important property of natural numbers is that they are a sequence. We define that idea using a predicate, , where is true if and only if x = y + 1. To use that to define the ordering of the naturals, we can say: .

Or in english: every natural number has a successor - you can always add one to a natural number and get another natural number.

We can also define predecessor similarly, with two statements:

1. .

So every number has a predecessor, and every number has a successor, and x is the predecessor of y if y is the successor of x.

To be able to define things like addition and subtraction, we can use successor. Let's define addition using a predicate Sum(x,y,z) which means "z = x + y".

Again, in english: for any two natural numbers, there is a natural number that it their sum; x + 0 always = x; and for any natural number, x + y = z is true if (x + 1) + (y - 1) = z.

Once we have addition, subtraction is easy:

That's: x-y=z if and only if x=y+z.

We can also define greater than using addition:

• .

That's x > y if you can get to x from y by adding one repeatedly.

So, we've got a nice definition of natural numbers here, right?

Almost. There's one teeny little mistake.

We said that every natural number has a successor and a predecessor, and we also said that a number x is greater than a number y if you can get from y to x using a sequence of successors. That means that 0 has a predecessor, and that the predecessor of 0 is less than 0. But we're supposed to be defining the natural numbers! And one of the standard axioms of the natural numbers is that . But we've violated that - we have both , and
. And with a contradiction like that in the system, we can prove anything we want, anything at all. We've got a totally worthless, meaningless system.

That's why mathematicians are so particular about proving the validity of their models: because the tiniest error can mean that anything you proved with the logic might not be true - your proofs are worthless.

Representational Crankery: the New Reals and the Dark Number

There's one kind of crank that I haven't really paid much attention to on this blog, and that's the real number cranks. I've touched on real number crankery in my little encounter with John Gabriel, and back in the old 0.999...=1 post, but I've never really given them the attention that they deserve.

There are a huge number of people who hate the logical implications of our definitions real numbers, and who insist that those unpleasant complications mean that our concept of real numbers is based on a faulty definition, or even that the whole concept of real numbers is ill-defined.

This is an underlying theme of a lot of Cantor crankery, but it goes well beyond that. And the basic problem underlies a lot of bad mathematical arguments. The root of this particular problem comes from a confusion between the representation of a number, and that number itself. "" isn't a number: it's a notation that we understand refers to the number that you get by dividing one by two.

There's a similar form of looniness that you get from people who dislike the set-theoretic construction of numbers. In classic set theory, you can construct the set of integers by starting with the empty set, which is used as the representation of 0. Then the set containing the empty set is the value 1 - so 1 is represented as { 0 }. Then 2 is represented as { 1, 0 }; 3 as { 2, 1, 0}; and so on. (There are several variations of this, but this is the basic idea.) You'll see arguments from people who dislike this saying things like "This isn't a construction of the natural numbers, because you can take the intersection of 8 and 3, and set intersection is meaningless on numbers." The problem with that is the same as the problem with the notational crankery: the set theoretic construction doesn't say "the empty set is the value 0", it says "in a set theoretic construction, the empty set can be used as a representation of the number 0.

The particular version of this crankery that I'm going to focus on today is somewhat related to the inverse-19 loonies. If you recall their monument, the plaque talks about how their work was praised by a math professor by the name of Edgar Escultura. Well, it turns out that Escultura himself is a bit of a crank.

The specify manifestation of his crankery is this representational issue. But the root of it is really related to the discomfort that many people feel at some of the conclusions of modern math.

A lot of what we learned about math has turned out to be non-intuitive. There's Cantor, and Gödel, of course: there are lots of different sizes of infinities; and there are mathematical statements that are neither true nor false. And there are all sorts of related things - for example, the whole ideaof undescribable numbers. Undescribable numbers drive people nuts. An undescribable number is a number which has the property that there's absolutely no way that you can write it down, ever. Not that you can't write it in, say, base-10 decimals, but that you can't ever write down anything, in any form that uniquely describes it. And, it turns out, that the vast majority of numbers are undescribable.

This leads to the representational issue. Many people insist that if you can't represent a number, that number doesn't really exist. It's nothing but an artifact of an flawed definition. Therefore, by this argument, those numbers don't exist; the only reason that we think that they do is because the real numbers are ill-defined.

This kind of crackpottery isn't limited to stupid people. Professor Escultura isn't a moron - but he is a crackpot. What he's done is take the representational argument, and run with it. According to him, the only real numbers are numbers that are representable. What he proposes is very nearly a theory of computable numbers - but he tangles it up in the representational issue. And in a fascinatingly ironic turn-around, he takes the artifacts of representational limitations, and insists that they represent real mathematical phenomena - resulting in an ill-defined number theory as a way of correcting what he alleges is an ill-defined number theory.

His system is called the New Real Numbers.

In the New Real Numbers, which he notates as , the decimal notation is fundamental. The set of new real numbers consists exactly of the set of numbers with finite representations in decimal form. This leads to some astonishingly bizarre things. From his paper:

3) Then the inverse operation to multiplication called division; the result of dividing a decimal by another if it exists is called quotient provided the divisor is not zero. Only when the integral part of the devisor is not prime other than 2 or 5 is the quotient well defined. For example, 2/7 is ill defined because the quotient is not a terminating decimal (we interpret a fraction as division).

So 2/7ths is not a new real number: it's ill-defined. 1/3 isn't a real number: it's ill-defined.

4) Since a decimal is determined or well-defined by its digits, nonterminating decimals are ambiguous or ill-defined. Consequently, the notion irrational is ill-defined since we cannot cheeckd all its digits and verify if the digits of a nonterminaing decimal are periodic or nonperiodic.

After that last one, this isn't too surprising. But it's still absolutely amazing. The square root of two? Ill-defined: it doesn't really exist. e? Ill-defined, it doesn't exist. ? Ill-defined, it doesn't really exist. All of those triangles, circles, everything that depends on e? They're all bullshit according to Escultura. Because if he can't write them down in a piece of paper in decimal notation in a finite amount of time, they don't exist.

Of course, this is entirely too ridiculous, so he backtracks a bit, and defines a non-terminating decimal number. His definition is quite peculiar. I can't say that I really follow it. I think this may be a language issue - Escultura isn't a native english speaker. I'm not sure which parts of this are crackpottery, which are linguistic struggles, and which are notational difficulties in reading math rendered as plain text.

5) Consider the sequence of decimals,

(d)^na_1a_2...a_k, n = 1, 2, ..., (1)

where d is any of the decimals, 0.1, 0.2, 0.3, ..., 0.9, a_1, ..., a_k, basic integers (not all 0 simultaneously). We call the nonstandard sequence (1) d-sequence and its nth term nth d-term. For fixed combination of d and the a_j's, j = 1, ..., k, in (1) the nth term is a terminating decimal and as n increases indefinitely it traces the tail digits of some nonterminating decimal and becomes smaller and smaller until we cannot see it anymore and indistinguishable from the tail digits of the other decimals (note that the nth d-term recedes to the right with increasing n by one decimal digit at a time). The sequence (1) is called nonstandard d-sequence since the nth term is not standard g-term; while it has standard limit (in the standard norm) which is 0 it is not a g-limit since it is not a decimal but it exists because it is well-defined by its nonstandard d-sequence. We call its nonstandard g-limit dark number and denote by d. Then we call its norm d-norm (standard distance from 0) which is d > 0. Moreover, while the nth term becomes smaller and smaller with indefinitely increasing n it is greater than 0 no matter how large n is so that if x is a decimal, 0 < d < x.

I think that what he's trying to say there is that a non-terminating decimal is a sequence of finite representations that approach a limit. So there's still no real infinite representations - instead, you've got an infinite sequence of finite representations, where each finite representation in the sequence can be generated from the previous one. This bit is why I said that this is nearly a theory of the computable numbers. Obviously, undescribable numbers can't exist in this theory, because you can't generate this sequence.

Where this really goes totally off the rails is that throughout this, he's working on the assumption that there's a one-to-one relationship between representations and numbers. That's what that "dark number" stuff is about. You see, in Escultura's system, 0.999999... is not equal to one. It's not a representational artifact. In Escultura's system, there are no representational artifacts: the representations are the numbers. The "dark number", which he notates as , is (1-0.99999999...) and . In fact, is the smallest number greater than 0. And you can generate a complete ordered enumeration of all of the new real numbers, .

Reading Escultura, every once in a while, you might think he's joking. For example, he claims to have disproven Fermat's last theorem. Fermat's theorem says that for n>2, there are no integer solutions for the equation . Escultura says he's disproven this:

The exact solutions of Fermat's equation, which are the counterexamples to FLT, are given by the triples (x,y,z) = ((0.99...)10^T,d*,10^T), T = 1, 2, ..., that clearly satisfies Fermat's equation,

x^n + y^n = z^n, (4)

for n = NT > 2. Moreover, for k = 1, 2, ..., the triple (kx,ky,kz) also satisfies Fermat's equation. They are the countably infinite counterexamples to FLT that prove the conjecture false. One counterexample is, of course, sufficient to disprove a conjecture.

Even if you accept the reality of the notational artifact , this makes no sense: the point of Fermat's last theorem is that there are no integer solutions; is not an integer; is not an integer. Surely he's not that stupid. Surely he can't possibly believe that he's disproven Fermat using non-integer solutions? I mean, how is this different from just claiming that you can use (2, 3, 351/3) as a counterexample for n=3?

But... he's serious. He's serious enough that he's published published a real paper making the claim (albeit in crackpot journals, which are the only places that would accept this rubbish).

Anyway, jumping back for a moment... You can create a theory of numbers around this rubbish. The problem is, it's not a particularly useful theory. Why? Because it breaks some of the fundamental properties that we expect numbers to have. The real numbers define a structure called a field, and a huge amount of what we really do with numbers is built on the fundamental properties of the field structure. One of the necessary properties of a field is that it has unique identity elements for addition and multiplication. If you don't have unique identities, then everything collapses.

So... Take . That's the multiplicative inverse of 9. So, by definition, - the multiplicative identity.

In Escultura's theory, is a shorthand for the number that has a representation of 0.1111.... So, . So is also a multiplicative identity. By a similar process, you can show that itself must be the additive identity. So either , or else you've lost the field structure, and with it, pretty much all of real number theory.

The Surprises Never Eend: The Ulam Spiral of Primes

One of the things that's endlessly fascinating to me about math and
science is the way that, no matter how much we know, we're constantly
discovering more things that we don't know. Even in simple, fundamental
areas, there's always a surprise waiting just around the corner.

A great example of this is something called the Ulam spiral,
named after Stanislaw Ulam, who first noticed it. Take a sheet of graph paper.
Put "1" in some square. Then, spiral out from there, putting one number in
each square. Then circle each of the prime numbers. Like the following:

If you do that for a while - and zoom out, so that you can't see the numbers,
but just dots for each circled number, what you'll get will look something like
this:

That's the Ulam spiral filling a 200x200 grid. Look at how many diagonal
line segments you get! And look how many diagonal line segments occur along
the same lines! Why do the prime numbers tend to occur in clusters
along the diagonals of this spiral? I don't have a clue. Nor, to my knowledge,
does anyone else!

And it gets even a bit more surprising: you don't need to start
the spiral with one. You can start it with one hundred, or seventeen thousand. If
you draw the spiral, you'll find primes along diagonals.

Intuitions about it are almost certainly wrong. For example, when I first
thought about it, I tried to find a numerical pattern around the diagonals.
There are lots of patterns. For example, one of the simplest ones is
that an awful lot of primes occur along the set of lines
f(n) = 4n2+n+c, for a variety of values of b and c. But what does
that tell you? Alas, not much. Why do so many primes occur along
those families of lines?

You can make the effect even more prominent by making the spiral
a bit more regular. Instead of graph paper, draw an archimedean spiral - that
is, the classic circular spiral path. Each revolution around the circle, evenly
distribute the numbers up to the next perfect square. So the first spiral will have 2, 3, 4;
the next will have 5, 6, 7, 8, 9. And so on. What you'll wind up with is
called the Sack's spiral, which looks like this:

This has been cited by some religious folks as being a proof of the
existence of God. Personally, I think that that's silly; my personal
belief is that even a deity can't change the way the numbers work: the
nature of the numbers and how they behave in inescapable. Even a deity who
could create the universe couldn't make 4 a prime number.

Even just working with simple integers, and as simple a concept of
the prime numbers, there are still surprises waiting for us.

Sorry, Denise - but God didn't make numbers

I was planning on ignoring this one, but tons of readers have been writing
to me about the latest inanity spouting from the keyboard of Discovery
Institute's flunky, Denise O'Leary.

Here's what she had to say:

Even though I am not a creationist by any reasonable definition,
I sometimes get pegged as the local gap tooth creationist moron. (But then I
don't have gaps in my teeth either. Check unretouched photos.)

As the best gap tooth they could come up with, a local TV station interviewed
me about "superstition" the other day.

The issue turned out to be superstition related to numbers. Were they hoping
I'd fall in?

The skinny: Some local people want their house numbers changed because they
feel the current number assignment is "unlucky."

Look, guys, numbers here are assigned on a strict directional rota. If the
number bugs you so much, move.

Don't mess up the street directory for everyone else. Paramedics, fire chiefs,
police chiefs, et cetera, might need a directory they can make sense of. You
might be glad for that yourself one day.

Anyway, I didn't get a chance to say this on the program so I will now: No
numbers are evil or unlucky. All numbers are - in my view - created by God to
march in a strict series or else a discoverable* series, and that is what
makes mathematics possible. And mathematics is evidence for design, not
superstition.

The interview may never have aired. I tend to flub the gap-tooth creationist
moron role, so interviews with me are often not aired.

* I am thinking here of numbers like pi, that just go on and on and never
shut up, but you can work with them anyway.(You just decide where you want
to cut the mike.)

You can't write that number; in fact, you can't write most numbers.

In my Dembski rant, I used a metaphor involving the undescribable numbers. An interesting confusion came up in the comments about just what that meant. Instead of answering it with a comment, I decided that it justified a post of its own. It's a fascinating topic which is incredibly counter-intuitive. To me, it's one of the great examples of how utterly wrong our
intuitions can be.

Numbers are, obviously, very important. And so, over the ages, we've invented lots of notations that allow us to write those numbers down: the familiar arabic notation, roman numerals, fractions, decimals, continued fractions, algebraic series, etc. I could easily spend months on this blog just writing about different notations that we use to write numbers, and the benefits and weaknesses of each notation.

But the fact is, the vast, overwhelming majority of numbers cannot be written
down in any form
.

That statement seems bizarre at best. But it does actually make sense. But for it to
make sense, we have to start at the very beginning: What does it mean for a number to be describable?

Basics: Significant Figures

After my post the other day about rounding errors, I got a ton of
requests to explain the idea of significant figures. That's
actually a very interesting topic.

The idea of significant figures is that when you're doing
experimental work, you're taking measurements - and measurements
always have a limited precision. The fact that your measurements - the
inputs to any calculation or analysis that you do - have limited
precision, means that the results of your calculations likewise have
limited precision. Significant figures (or significant digits, or just "sigfigs" for short) are a method of tracking measurement
precision, in a way that allows you to propagate your precision limits

Before getting to the rules for sigfigs, it's helpful to show why
they matter. Suppose that you're measuring the radius of a circle, in
order to compute its area. You take a ruler, and eyeball it, and end
up with the circle's radius as about 6.2 centimeters. Now you go to
compute the area: π=3.141592653589793... So what's the area of the
circle? If you do it the straightforward way, you'll end up with a
result of 120.76282160399165 cm2.