A White Boy's Observations of Sexism and the Adria Richards Fiasco

(by MarkCC) Mar 28 2013

I've been watching the whole Adria Richards fiasco with a sense of horror and disgust. I'm finally going to say something, but for the most part, it's going to be indirect.

See, I'm a white guy, born into an upper middle class white family. That means that I'm awfully lucky. I'm part of the group that is, effectively, treated as the normal, default person in most settings. I'm also a guy who's married to a Chinese woman, and who's learned a bit about how utterly clueless I am.

Here's the fundamental issue that underlies all of this, and many similar stories: our society is deeply sexist and racist. We are all raised in an environment in which men's voices are more important than women's. It's so deeply ingrained in us that we don't even notice it.

What this means is that we are all, to some degree, sexist and racist. When I point this out, people get angry. We've all learned that sexism is a bad thing, so when I say to someone "you are sexist", it's really easy to interpret that as me saying that you're a bad person: sexism is bad, so if I'm sexist, then I'm bad.

But we really can't get away from this reality. We are sexists. For many of us, we're not deliberately sexist, we're not consciously sexist. But we are sexist.

Here's a really interesting experiment to try, if you have the opportunity. Visit an elementary school classroom. First, just watch the teacher interact with the students while they're teaching. Don't try to count interactions. Just watch. See if you think that any group of kids is getting more attention than any other. Most of the time, you'll probably get the feeling that they're paying roughly equal attention to the boys and the girls, or to the white students and the black students. Then come back on a different day, and count the number of times that they call on boys versus calling on girls. I've done this, after a friend suggested the idea. The result was amazing. I really, honestly believed that the teacher was treating her students (the teacher I did this with was a woman) equally. But when I counted? She was calling on boys twice as often as girls.

This isn't an unusual outcome. Do some looking online for studies of classroom gender dynamics, and you'll find lots of structured observations that come to the same conclusion.

My own awakening about these kinds of things came from my time working at IBM. I've told this first story before, but it's really worth repeating.

One year, I managed the summer internship programs for my department. The previous summer, IBM Research had wound up with an internship class consisting of 99% men. (That's not an estimate: that's a real number. That year, IBM Research hired 198 summer interns, of whom 2 were women.) For a company like IBM, numbers like that are scary. Ignoring all of the social issues of excluding potentially great candidates, numbers like that can open the company up to gender discrimination lawsuits!

So my year, they decided to encourage the hiring of more diverse candidates. The way that they did that was by allocating each department a budget for summer interns: a department could only hire up to its budgeted number of interns. The exception was that women and minority candidates didn't count against the budget.

When the summer program hiring opened, my department was allocated a budget of six students. All six slots were gone within the first day. Every single one of them went to a white, American, male student.

The second day, the guy across the hall from me came by with a resume for a student he wanted to hire. This was a guy I really liked and greatly respected. He was not, by any reasonable measure, a bad guy - he was a really good person. Anyway, he had this resume, for yet another guy. I told him the budget was gone, but that if he could find a good candidate who was a woman or a minority, we could hire them. He exploded, ranting about how we were being sexist, discriminating against men. He just wanted to hire the best candidate for the job! We were demanding that he not hire the best candidate, that he hire someone less qualified, in order to satisfy some politically correct bureaucrat! There was nothing I could do, so eventually he stormed out.

Three days later, he came back to my office with another resume. He was practically bouncing off the walls, he was so happy. "I found another student to hire. She's even better than the guy I originally came to you with! She's absolutely perfect for the job!" We hired her.

I asked him why he didn't find her before. He had no answer - he didn't know why he hadn't found her resume in his first search.

This was a pattern that I observed multiple times that year. Looking through a stack of resumes, without deliberately excluding women, somehow, all of the candidates with female names wound up back in the slushpile. I don't think that anyone was deliberately saying "Hmm, Jane, that's a woman's name, I don't want to hire a woman". But I do think that in the process of looking through a file containing 5000 resumes, trying to select which ones to look at, on an unconscious level, they were more likely to look carefully at a candidate with a male name, because we all learn, from a young age, that men are smarter than women, men are more serious than women, men are better workers than women, men are more likely to be technically skilled than women. Those attitudes may not be part of our conscious thought, but they are part of the cultural background that gets drummed into us by school, by books, by movies, by television, by commercials.

As I said, that was a real awakening for me.

I was talking about this with my next-door office neighbor, who happened to be one of the only two women in my department (of about 60 people) at the time. She was shocked that I hadn't noticed this before. So she pointed out that in meetings, she could say something and everyone would ignore it, but if a guy said the same thing, he'd get listened to. We'd been in many meetings together, and I'd never noticed this!

So I started paying attention, and she was absolutely right.

What happened next is my second big awakening.

I started watching this in meetings, and when people brushed over something she'd said, I'd raise my voice and say "X just suggested blah, which I think is a really good idea. What about it?". I wanted to help get her voice listened to.

She was furious at me. This just blew my mind. I was really upset at her at first. Dammit, I was trying to help, and this asshole was yelling at me for it! She'd complained about how people didn't listen to her, and now when I was trying to help get her listened to, she was complaining again!

What I realized, after I calmed down and listened to her, was that I was wrong. I hadn't spoken to her about doing it. I didn't understand what it meant. The problem was that people didn't take her seriously because she was a woman. People might listen to me, because I'm also a white guy. But when I spoke for her, I wasn't helping. When a man speaks on behalf of a woman, he's reinforcing the idea that a woman's voice isn't supposed to be heard. I was substituting my man's voice for her woman's voice, and by doing that, I wasn't just failing to help her - I was actively hurting her, because the social interpretation of my action was that "X can't speak for herself". And more, I learned that by taking offense when she pointed out that I had screwed up, I was definitely in the wrong - my instinct was to react in exactly the wrong way.

What I learned, gradually, from watching things like this, from becoming more sensitive and aware, and from listening to what women said, was that I was completely clueless.

The fact is, I constantly benefit from a very strong social preference. I don't notice that. Unless I'm really trying hard to pay attention, I'm not aware of all of the benefits that I get from that. I don't notice all of the times when I'm getting a benefit. Worse, I don't notice all of the times when my behavior is asserting that social preference as my right.

It's very easy for a member of an empowered majority to just take things for granted. We see the way that we are treated as a default, and assume that everyone is treated the same way. We don't perceive that we are being treated preferentially. We don't notice that the things that offend us are absolutely off limits to everyone, but that things that we do to offend others are accepted as part of normal behavior. Most importantly, we don't notice when our behavior is harmful to people who aren't part of our empowered group. And when we do offend someone who isn't part of the empowered majority, we take offense at the fact that they're offended. Because they're saying that we did something bad, and we know that we aren't bad people!

The way that this comes back to the whole Adria Richards fiasco is very simple. Many people have looked at what happened at PyCon, and said something like "She shouldn't have tweeted their picture", or "She shouldn't have been offended, they didn't do anything wrong", or "She should have just politely spoken to them".

I don't know whether what she did was right or not. I wasn't there. I didn't hear the joke that the guys in question allegedly told. What I do know is that for a member of the minority out-group, there is frequently no action that will be accepted as "right" if it includes the assertion that the majority did something offensive.

I've seen this phenomenon very directly myself - not in the context of sexism, but of antisemitism. There's an expression I've heard multiple times in the northeast US for haggling over the price of a car: "jewing the salesman down". I absolutely find that extremely offensive. And I've called people out on it. There is no response that's actually acceptable.

If I politely say "You know, that's relying on a stereotype of me and my ancestors that's really hurtful", the response is: "Oh, come on, it's just harmless. I'm not talking about you, it's just a word. You're being oversensitive". If I get angry, the response is "You Jews are so strident". If I go to an authority figure in the setting, "You Jews are so passive aggressive, why couldn't you just talk to me?". No matter what I do, I'm wrong. Women deal with this every day, only they're in a situation where the power dynamic is even less in their favor.

That's the situation that women - particularly women in tech - find themselves in every day. We are sexist. We do mistreat women in tech every day, without even knowing that we're doing it. And we're very likely to take offense if they mention that we did something wrong. Because we know that we're good people, and since we aren't deliberately doing something bad, they must be wrong.

For someone in Adria Richards' situation at PyCon, there is no course of action that can't be taken wrong. As a woman hearing the joke in question, she certainly knew whether or not it was offensive to her. But once she'd heard something offensive, there was nothing she could do that someone couldn't turn into a controversy.

Was the joke offensive? We don't know what, specifically, he said. The only facts we're certain of are that, in her judgement, it was offensive, and that the authorities at PyCon agreed and asked the gentleman in question to apologize.

Did the guy who made the joke deserve to be fired? I don't know. If this stupid joke were the first time he'd ever done something wrong, then he didn't deserve to be fired. But we don't know what his history is like. I know how hard it is to hire skilled engineers, so I'm very skeptical that any company would fire someone over one minor offense. It's possible that his company has a crazy hair-trigger HR department. But it's also possible that there's background that we don't know about. That he's done stuff before, and been warned. If that's the case, then his company could have decided that this was the last straw.

Did Adria Richards deserve to be fired? Almost certainly not. We know more about her case than we do about the guy who told the joke. We know that her company fired her over this specific incident, because in their announcement of her firing, they told us the reason. They didn't cite any past behavior - they just specifically cited this incident and its aftermath as the reason for firing her. It's possible that there's a history here that we don't know about, that she'd soured relations with customers of her company in incidents other than this, and that this was a last straw. But it doesn't seem likely, based on the facts that we're aware of.

Did either of them deserve to be threatened? Absolutely not.


Genius Continuum Crackpottery

(by MarkCC) Mar 21 2013

There's a lot of mathematical crackpottery out there. Most of it is just pointless and dull. People making the same stupid mistakes over and over again, like the endless repetitions of the same-old supposed refutations of Cantor's diagonalization.

After you eliminate that, you get reams of insanity - stuff so incoherent that it doesn't make any sense at all. This kind of thing is usually word salad: words strung together in ways that don't mean anything.

After you eliminate that, sometimes, if you're really lucky, you'll come across something truly special. Crackpottery as utter genius. Not genius in a good way, like they're an outsider genius who discovered something amazing, but genius in the worst possible way, where someone has created something so bizarre, so overwrought, so utterly ridiculous that it's a masterpiece of insane, delusional foolishness.

Today, we have an example of that: Existics! This is a body of work by a high school dropout named Gavin Wince with truly immense delusions of grandeur. Pomposity on a truly epic scale!

I'll walk you through just a tiny sample of Mr. Wince's genius. You can go look at his site to get more, and develop a true appreciation for this. He doesn't limit himself to mere mathematics: math, physics, biology, cosmology - you name it, Mr. Wince has mastered it and written about it!

The best of his mathematical crackpottery is something called C3: the Canonized Cardinal Continuum. Mr. Wince has created an algebraic solution to the continuum hypothesis, and along the way, has revolutionized number theory, algebra, calculus, real analysis, and god only knows what else!

Since Mr. Wince believes that he has solved the continuum hypothesis, let me remind you of what that is:

  1. If you use Cantor's set theory to explore numbers, you get to the uncomfortable result that there are different sizes of infinity.
  2. The smallest infinite cardinal number is called ℵ0, and it's the size of the set of natural numbers.
  3. There are cardinal numbers larger than ℵ0. The first one larger than ℵ0 is ℵ1.
  4. We know that the set of real numbers is the same size as the powerset of the natural numbers - 2^ℵ0 - and that it is larger than the set of the naturals.
  5. The question that the continuum hypothesis tries to answer is: is the size of the set of real numbers equal to ℵ1? That is, is there a cardinal number between ℵ0 and 2^ℵ0?

The continuum hypothesis was "solved" in 1963. In 1940, Gödel showed that you couldn't disprove the continuum hypothesis using ZFC. In 1963, another mathematician, Paul Cohen, showed that it couldn't be proven using ZFC. So a hypothesis which is about set theory can be neither proven nor disproven using set theory. It's independent of the axioms of set theory. You can choose to take the continuum hypothesis as an axiom, or you can choose to take the negation of the continuum hypothesis as an axiom: either choice is consistent and valid!

It's not a happy solution. But it's solved in the sense that we've got a solid proof that you can't prove it's true, and another solid proof that you can't prove it's false. That means that, given ZFC set theory as a basis, there is no proof either way - unless you simply adopt it (or its negation) as an axiom.

But... Mr. Wince knows better.

The set of errors that Wince makes is really astonishing. This is really seriously epic crackpottery.

He makes it through one page without saying anything egregious. But then he makes up for it on page 2, by making multiple errors.

First, he pulls an Escultura:

x_1 = 1/2^1 = 1/2 = 0.5

x_2 = 1/2^1 + 1/2^2 = 1/2 + 1/4 = 0.75

x_3 = 1/2^1 + 1/2^2 + 1/2^3 = 1/2 + 1/4 + 1/8 = 0.875

...

At the end or limit of the infinite sequence, the final term of the sequence is 1.0

...

In this example we can see that as the number of finite sums of the sequence approaches the limit infinity, the last term of the sequence equals one.

x_n = 1.0

If we are going to assume that the last term of the sequence equals one, it can be deduced that, prior to the last term in the sequence, some finite sum in the series occurs where:

x_{n-1} = 0.999…

x_{n-1} = 1/2^1 + 1/2^2 + 1/2^3 + 1/2^4 + … + 1/2^{n-1} = 0.999…

Therefore, at the limit, the last term of the series of the last term of the sequence would be the term, which, when added to the sum 0.999… equals 1.0.

There is no such thing as the last term of an infinite sequence. Even if there were, the number 0.999.... is exactly the same as 1. It's a notational artifact, not a distinct number.
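
If you want to see this concretely, here's a quick Python illustration (mine, not Wince's): the partial sums keep creeping up toward 1, but no partial sum ever equals 1 - there's no "last term" to point at.

def partial_sums(n):
    # Partial sums of 1/2 + 1/4 + 1/8 + ...: they approach 1, but never reach it.
    total = 0.0
    for k in range(1, n + 1):
        total += 1.0 / 2**k
        print(k, total)

partial_sums(10)  # the 10th partial sum is 0.9990234375: close to 1, but not 1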

But this is the least of his errors. For example, the first paragraph on the next page:

The set of all countable numbers, or natural numbers, is a subset of the continuum. Since the set of all natural numbers is a subset of the continuum, it is reasonable to assume that the set of all natural numbers is less in degree of infinity than the set containing the continuum.

We didn't need to go through the difficulty of Cantor's diagonalization! We could have just blindly asserted that it's obvious!

Or actually... the fact that there are multiple degrees of infinity is anything but obvious. I don't know anyone who wasn't surprised the first time they saw Cantor's proof. It's a really strange idea that there's something bigger than infinity.

Moving on... the real heart of his stuff is built around some extremely strange notions about infinite and infinitesimal values.

Before we even look at what he says, there's an important error here which is worth mentioning. What Mr. Wince is trying to do is talk about the continuum hypothesis. The continuum hypothesis is a question about the cardinality of the set of real numbers and the set of natural numbers. Neither infinities nor infinitesimals are members of either set.

Infinite values come into play in Cantor's work: the cardinality of the natural numbers and the cardinality of the reals are clearly infinite cardinal numbers. But ℵ0, the smallest infinite cardinal, is not a member of either set.

Infinitesimals are fascinating. You can reconstruct differential and integral calculus without using limits by building it in terms of infinitesimals. There's some great stuff in the surreal numbers playing with infinitesimals. But infinitesimals are not real numbers. You can't reason about them as if they were members of the set of real numbers, because they aren't.

Many of his mistakes are based on this idea.

For example, he's got a very strange idea that infinities and infinitesimals don't have fixed values, but that their values cover a range. The way that he gets to that idea is by asserting the existence of infinity as a specific, numeric value, and then using it in algebraic manipulations, like taking the "infinityth root" of a real number.

For example, on his way to "proving" that infinitesimals have this range property that he calls "perambulation", he defines a value that he calls κ:

\[ \sqrt[\infty]{\infty} = 1 + \kappa \]

In terms of the theory of numbers, this is nonsense. There is no such thing as an infinityth root. You can define an Nth root, where N is a real number, just like you can define an Nth power - exponents and roots are mirror images of the same concept. But roots and exponents aren't defined for infinity, because infinity isn't a number. There is no infinityth root.

You could, if you really wanted to, come up with a definition of exponents that allowed you to define an infinityth root. But it wouldn't be very interesting. If you followed the usual pattern for these things, it would be a limit: \( \sqrt[\infty]{x} = \lim_{n \rightarrow \infty} \sqrt[n]{x} \). That's clearly 1. Not 1 plus something: just exactly 1.
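
You can watch that limit numerically. Here's a tiny Python illustration (again, mine, not Wince's): take the n-th root of a big number, and watch it collapse to 1 as n grows.

for n in (2, 10, 100, 1000, 1000000):
    # The n-th root of 1,000,000: as n grows, this converges to exactly 1.
    print(n, 1000000.0 ** (1.0 / n))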

But Mr. Wince doesn't let himself be limited by silly notions of consistency. No, he defines things his own way, and runs with it. As a result, he gets a notion that he calls perambulation. How?

Take the definition of κ:

\[ \sqrt[\infty]{\infty} = 1 + \kappa \]

Now, you can, obviously, raise both sides to the power of infinity:

\[ \infty = (1 + \kappa)^{\infty} \]

Now, you can substitute ℵ0 for \( \infty \). (Why? Don't ask why. You just can.) Then you can factor it. His factoring makes no rational sense, so I won't even try to explain it. But he concludes that:

  • Factored and simplified one way, you end up with (κ+1) = 1 + x, where x is some infinitesimal number larger than κ. (Why? Why the heck not?)
  • Factored and simplified another way, you end up with (κ+1) = ℵ
  • If you take the mean of all of the possible factorings and reductions, you get a third result, that (κ+1) = 2.

He goes on, and on, and on like this. From perambulation to perambulating reciprocals, to subambulation, to ambulation. Then un-ordinals, un-sets... this is really an absolute masterwork of utter insane crackpottery.

Do download it and take a look. It's a masterpiece.


Pi-day randomness

(by MarkCC) Mar 14 2013

One of my twitter friends was complaining about something that's apparently making the rounds of Facebook for π-day. It annoyed me sufficiently to be worth ranting about a little bit.

Why isn't π rational if π=circumference/diameter, and both measurements are plainly finite?

There's a couple of different ways of interpreting this question.

The stupidest way of interpreting it is that the author didn't have any clue what an irrational number is. An irrational number is a number that cannot be written as a ratio of two integers. Another way of saying essentially the same thing is that there's no way to write an irrational number down as a finite decimal. I've seen people get this wrong before, confusing not having a finite representation with not being finite.

π doesn't have a finite representation. But it's very clearly finite - it's less than 3 1/4, which is obviously not infinite. Anyone who can look at π, and be confused about whether or not it's finite is... well... there's no nice way to say this. If you think that π isn't finite, you're an idiot.

The other way of interpreting this statement is less stupid: it's a question of measurement. If you have a circular object in real life, then you can measure the circumference and the diameter, and do the division on the measurements. The measurements have finite precision. So how can the ratio of two measurements with finite precision be irrational?

The answer is, they can't. But perfect circles don't exist in the real world. Many mathematical concepts don't exist in the real world. In the real world, there's no such thing as a mathematical point, no such thing as a perfect line, no such thing as perfectly parallel lines.

π isn't a measured quantity. It's a theoretical quantity, which can be computed analytically from the abstract properties of an ideal, perfect circle.

No "circle" in the real world has a perfect ratio of π between its circumference and its diameter. But the theoretical circle does.

The Facebook comments on this get much worse than the original question. One in particular really depressed me.

Just because the measurements are finite doesn't mean they're rational.
Pi is possibly rational, we just haven't figured out where it ends.

Gah, no!

We know an awful lot about π. And we know, with absolute, 100% perfect certainty that π never ends.

We can define π precisely as a series, and that series makes it abundantly clear that it never ends.

\[ \pi = \frac{4}{1} - \frac{4}{3} + \frac{4}{5} - \frac{4}{7} + \frac{4}{9} - \cdots \]

That series goes on forever. π can't ever end, because that series never ends.

Just for fun, here's a little snippet of Python code that you can play with. You can see how, up to the limits of your computer's floating point representation, a series computation of π keeps on going, changing with each additional iteration.

def pi(numiter):
    # Approximate pi using the Nilakantha series:
    #   pi = 3 + 4/(2*3*4) - 4/(4*5*6) + 4/(6*7*8) - ...
    # (a faster-converging cousin of the Leibniz series shown above)
    val = 3.0
    sign = 1
    for i in range(numiter):
        term = ((i+1)*2) * ((i+1)*2 + 1) * ((i+1)*2 + 2)
        val = val + sign * 4.0 / term
        sign = -sign
    return val
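
For example, you can watch the approximation settle down (a quick check using the function above; the iteration counts are arbitrary):

for n in (1, 10, 100, 1000):
    print(n, pi(n))
# Each larger iteration count lands closer to 3.141592653589793...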


Finally: Gödel's Proof of Incompleteness!

(by MarkCC) Mar 12 2013

Finally, we're at the end of our walkthrough of Gödel's great incompleteness proof. As a refresher, the basic proof sketch is:

  1. Take a simple logic. We've been using a variant of the Principia Mathematica's logic, because that's what Gödel used.
  2. Show that any statement in the logic can be encoded as a number using an arithmetic process based on the syntax of the logic. The process of encoding statements numerically is called Gödel numbering.
  3. Show that you can express meta-mathematical properties of logical statements in terms of arithmetic properties of their Gödel numbers. In particular, we need to build up the logical infrastructure that we need to talk about whether or not a statement is provable.
  4. Using meta-mathematical properties, show how you can create an unprovable statement encoded as a Gödel number.

What came before:

  1. Gödel numbering: The logic of the Principia, and how to encode it as numbers. This was step 1 in the sketch.
  2. Arithmetic Properties: what it means to say that a property can be expressed arithmetically. This set the groundwork for step 2 in the proof sketch.
  3. Encoding meta-math arithmetically: how to take meta-mathematical properties of logical statements, and define them as arithmetic properties of the Gödel numberings of the statements. This was step 2 proper.

So now we can move on to step three, where we actually see why mathematical logic is necessarily incomplete.

When we left off with Gödel, we'd gone through a very laborious process showing how we could express meta-mathematical properties of logical statements as primitive recursive functions and relations. We built up to being able to express one non-primitive recursive property, which describes the property that a given statement is provable:

pred provable(x) = {
  some y {
    proofFor(y, x)
  }
}

The reason for going through all of that was that we really needed to show how we could capture all of the necessary properties of logical statements in terms of arithmetic properties of their Gödel numbers.
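
If you want a concrete feel for how a statement becomes a number, here's a toy Python sketch of the idea (a deliberately simplified illustration of mine - Gödel's actual encoding in the paper is more elaborate): encode a sequence of symbol codes as a product of prime powers, which unique factorization lets you decode again.

def primes():
    # Generate 2, 3, 5, 7, ... by trial division; fine for a toy example.
    n, found = 2, []
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1

def godel_number(codes):
    # Encode [c1, c2, c3, ...] as 2^c1 * 3^c2 * 5^c3 * ...
    result = 1
    for p, c in zip(primes(), codes):
        result *= p ** c
    return result

print(godel_number([3, 1, 4]))  # 2**3 * 3**1 * 5**4 = 15000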

Now we can get to the target of Gödel's effort. What Gödel was trying to do was show how to defeat the careful stratification of the Principia's logic. In the Principia, Russell and Whitehead had tried to avoid problems with self-reference by creating a very strict stratification, where each variable or predicate had a numeric level, and could only reason about objects from lower levels. So if natural numbers were the primitive objects in the domain being reasoned about, then level-1 objects would be things like specific natural numbers, and level-1 predicates could reason about specific natural numbers, but not about sets of natural numbers or predicates over the natural numbers. Level-2 objects would be sets of natural numbers, and level-2 predicates could reason about natural numbers and sets of natural numbers, but not about predicates over sets of natural numbers, or sets of sets of natural numbers. Level-3 objects would be sets of sets of natural numbers... and so on.

The point of this stratification was to make self-reference impossible. You couldn't make a statement of the form "This predicate is true": the predicate would be a level-N predicate, and only a level N+1 predicate could reason about a level-N predicate.

What Gödel did in the laborious process we went through in the last post is embed a model of logical statements in the natural numbers. That's the real trick: the logic is designed to work with a set of objects that are a model of the natural numbers. By embedding a model of logical statements in the natural numbers, he made it possible for a level-1 predicate (a predicate about a specific natural number) to reason about any logical statement or object. A level-1 predicate can now reason about a level-7 object! A level-1 predicate can reason about the set defined by a level-1 predicate: a level-1 predicate can reason about itself!

Now, we can finally start getting to the point of all of this: incompleteness! We're going to use our newfound ability to nest logical statements into numbers to construct an unprovable true statement.

In the last post, one of the meta-mathematical properties that we defined for the Gödel-numbered logic was immConseq, which defines when some statement x is an immediate consequence of a set of statements S. As a reminder, that means that x can be inferred from statements in S in one inference step.

We can use that property to define what it means to be a consequence of a set of statements: it's the closure of immediate consequence. We can define it in pseudo-code as:

def conseq(κ) = {
  K = κ + axioms
  added_to_k = false
  do {
    added_to_k = false
    for all c in immConseq(K) {
      if c not in K {
        add c to K
        added_to_k = true
      }
    }
  } while added_to_k
  return K
}

In other words, Conseq(κ) is the complete set of everything that can possibly be inferred from the statements in κ and the axioms of the system. We can say that there's a proof for a statement x in κ if and only if x ∈ Conseq(κ).

We can use the idea of Conseq to define a strong version of what it means for a logical system with a set of facts to be consistent. A system is ω-consistent if and only if there is no statement a such that: a ∈ Conseq(κ) ∧ not(forall(v, a)) ∈ Conseq(κ).

In other words, the system is ω-consistent as long as it can never derive both a universal statement and the negation of that statement. Strictly speaking, ω-consistency is subtly different from (and stronger than) ordinary consistency. But for our purposes, we can treat it as being pretty much the same thing. (Yes, that's a bit hand-wavy, but I'm not trying to write an entire book about Gödel here!)

(Gödel's version of the definition of ω-consistency is harder to read than this, because he's very explicit about the fact that Conseq is a property of the numbers. I'm willing to fuzz that, because we've shown that the statements and the numbers are interchangeable.)

Using the definition of ω-consistency, we can finally get to the actual statement of the incompleteness theorem!

Gödel's First Incompleteness Theorem: For every ω-consistent primitive recursive set κ of formulae, there is a primitive-recursive predicate r(x) such that neither forall(v, r) nor not(forall(v, r)) is provable.

To prove that, we'll construct the predicate r.

First, we need to define a version of our earlier isProofFigure that's specific to the set of statements κ:

pred isProofFigureWithKappa(x, kappa) = {
  all n in 1 to length(x) {
    isAxiom(item(n, x)) or
    item(n, x) in kappa or
    some p in 0 to n {
      some q in 0 to n {
        immConseq(item(n, x), item(p, x), item(q, x))
      }
    }
  } and length(x) > 0
}

This is the same as the earlier definition - just specialized so that it ensures that every statement in the proof figure is either an axiom, or a member of κ.

We can do the same thing to specialize the predicates proofFor and provable:

pred proofForStatementWithKappa(x, y, kappa) = {
  isProofFigureWithKappa(x, kappa) and
  item(length(x), x) = y
}

pred provableWithKappa(x, kappa) = {
  some y {
    proofForStatementWithKappa(y, x, kappa)
  }
}

If κ is the set of basic truths that we can work with, then
provable in κ is equivalent to provable.

Now, we can define a predicate NotAProofWithKappa:

pred NotAProofWithKappa(x, y, kappa) = {
  not (proofForStatementWithKappa(x, subst(y, 19, number(y)), kappa))
}

Based on everything that we've done so far, NotAProofWithKappa is primitive recursive.

This is tricky, but it's really important. We're getting very close to the goal, and it's subtle, so let's take the time to understand this.

  • Remember that in a Gödel numbering, each prime number is a variable. So 19 here is just the name of a free variable in y.
  • Using the Principia's logic, the fact that variable 19 is free means that the statement is parametric in variable 19. For the moment, it's an incomplete statement, because it's got an unbound parameter.
  • What we're doing in NotAProofWithKappa is substituting the numeric coding of y for the value of y's parameter. When that's done, y is no longer incomplete: its unbound variable has been replaced by a binding.
  • With that substitution, NotAProofWithKappa(x, y, kappa) is true when x does not prove that y(y) is true.

What NotAProofWithKappa does is give us a way to check whether a specific sequence of statements x is not a proof of y.

We want to expand NotAProofWithKappa to something universal. Instead of just saying that a specific sequence of statements x isn't a proof for y, we want to be able to say that no possible sequence of statements is a proof for y. That's easy to do in logic: you just wrap the statement in a "∀ x ( )". In Gödel numbering, we defined a function that does exactly that. So the universal form of unprovability is:
∀ a (NotAProofWithKappa(a, y, kappa)).

In terms of the Gödel numbering, if we assume that the Gödel number for the variable a is 17, and the variable y is numbered 19, we're talking about the statement p = forall(17, NotAProofWithKappa(17, 19, kappa)).

p is the statement that for some logical statement (the value of variable 19, or y in our definition), there is no possible value for variable 17 (a) where a proves y in κ.

All we need to do now is show that we can make p become self-referential. No problem: we can just put number(p) in as the value of y in NotAProofWithKappa. If we let q be the numeric value of the statement NotAProofWithKappa(a, y, kappa), then:

r = subst(q, 19, p)

i = subst(p, 19, r)

i says that there is no possible value a that proves p(p). In other words, p(p) is unprovable: there exists no possible proof that there is no possible proof of p!

This is what we've been trying to get at all this time: self-reference! We've got a predicate that is able to express a property of itself. Worse, it's able to express a negative property of itself!

Now we're faced with two possible choices. Either i is provable - in which case, κ is inconsistent! Or else i is unprovable - in which case κ is incomplete, because we've identified a true statement that can't be proven!

That's it: we've shown that in the Principia's logic, using nothing but arithmetic, we can create a true statement that cannot be proven. If, somehow, it were to be proven, the entire logic would be inconsistent. So the Principia's logic is incomplete: there are true statements that cannot be proven true.

We can go a bit further: the process that we used to produce this result about the Principia's logic is actually applicable to other logics. There's no magic here: if your logic is powerful enough to do Peano arithmetic, you can use the same trick that we demonstrated here, and show that the logic must be either incomplete or inconsistent. (Gödel proved this formally, but we'll just handwave it.)

Looking at this with modern eyes, it doesn't seem quite as profound as it did back in Gödel's day.

When we look at it through the lens of today, what we see is that in the Principia's logic, proof is a mechanical process: a computation. If every true statement were provable, then you could take any statement S, write a program to search for a proof of either S or ¬S, and eventually that program would find one or the other, and stop.

In short, you'd be able to solve the halting problem. The proof of the halting problem is really an amazingly profound thing: on a very deep level, it's the same thing as incompleteness, only it's easier to understand.
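
Here's the shape of that impossible program, as a Python sketch. The proof-enumerator and proof-checker are hypothetical parameters - names I made up for illustration, not real functions:

def decide(s, enumerate_proofs, proves, negation):
    # If every true statement were provable, this loop would always terminate,
    # deciding any statement s - which would solve the halting problem.
    for proof in enumerate_proofs():    # every finite proof, in some order
        if proves(proof, s):
            return True                 # found a proof of s
        if proves(proof, negation(s)):
            return False                # found a proof of not-s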

But at the time that Gödel was working, Turing hadn't written his paper about the halting problem. Incompleteness was published in 1931; Turing's halting paper was published in 1936. This was a totally unprecedented idea when it was published. Gödel produced one of the most profound and surprising results in the entire history of mathematics, showing that the efforts of the best mathematicians in the world to produce the perfection of mathematics were completely futile.


Passwords, Hashing, and Salt

(by MarkCC) Mar 02 2013

Over on twitter, some folks were chatting about the latest big security botch. A major service, called Evernote, had a security breach where a password file was stolen. Evernote has handled the situation quite well, being open about what happened, and explaining the risks.

In their description of the breach, they said that the stolen passwords were "both hashed and salted". Apparently this sounds funny to people outside of software. (Amazing how jargon becomes so ingrained that I didn't even notice the fact that it could be interpreted in a funny way until it was pointed out to me!)

Anyway, since discussion of this is going around, I thought I'd explain just what password hashing and salting means.

Let's start at the beginning. You're some kind of system that wants to have password security. Obviously, you need to save the passwords somewhere, right?

As we'll see, that's only partially correct - but let's go with it for now. You need to store something that lets you check if a user supplied the right password, so you need to store something.

The most naive approach is create a file (or database, or whatever) that contains the usernames and passwords of all of your users. Something like:

alice:abc
mark:pass
joe:123
jen:hello

Suppose you're a thief, and you want to crack this password file. What would you do? You'd try to steal that file! If you can get hold of that password file, then you have all of the passwords for all of the users of the system.

That means that this is a terrible way of storing the passwords. One step, and a thief has completely breached your system. We don't want that. So what should we do?

First, we could encrypt the file.

This might seem like a good idea at first. If a thief were to steal the file, they wouldn't be able to find your users' passwords without figuring out the encryption key! It's going to take a lot of work to figure out that key!

The first problem with this is that any encryption can be cracked given enough time and computing power. If there's only one encryption key for the entire file, then it's worth investing a lot of time and power into breaking it - and once it's broken, everything is revealed.

The second problem is: how does your system check a user's password? It needs to decrypt the file! That means that the encryption key must be available to your system! So all that a thief needs to do is figure out where your system is getting the key from. You've got your entire security system for all of your users set up with a single point of protection, and somewhere in your system, everything that you need to break that protection is available!

What can we do to improve this? The answer is something called cryptographic hashing.

Cryptographic hashing is a combination of two concepts: hashing, and one-way functions.

A really simple example of a not-very-good hash function for a string would be something like: convert all of the characters in the string to their numeric values, and exclusive-or the binary representations of those values together. With that hash function, you could take a string like "ABCD", convert it to the numeric values of the characters ([65, 66, 67, 68]), and then xor them together (1000001 xor 1000010 xor 1000011 xor 1000100 = 0000100) for a result of 4. Real practical hash functions are much more complicated.
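
Here's that toy hash in runnable Python (an illustration of the idea, emphatically not a hash anyone should use):

def toy_hash(s):
    # XOR together the numeric values of all of the characters.
    h = 0
    for ch in s:
        h ^= ord(ch)
    return h

print(toy_hash("ABCD"))  # 4, matching the worked example above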

For example, at least some versions of Java use the following as the default hash for a string of characters:

\[ \text{hash}(s) = \left( \sum_{i=0}^{\text{length}(s)-1} 31^{\text{length}(s) - i - 1} \cdot s[i] \right) \mod 2^{32} \]
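
In code, that sum is usually computed incrementally. Here's a Python sketch of it (assuming 32-bit wraparound; Java additionally interprets the result as a signed integer, which I'm ignoring here):

def java_style_hash(s):
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF  # keep 32 bits, like Java int overflow
    return h

print(java_style_hash("ABCD"))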

There's a class of mathematical functions called one-way functions. A one way function is a function \(f\), where given \(x\), it's easy to compute \(f(x)\), but given \(f(x)\) it's extremely difficult (or even impossible) to compute \(x\).

If we combine those two, we have what's called a cryptographic hash function: a function that takes an arbitrary input string, and converts it to a number, in a way where it's very difficult to figure out what input string produced the number. That's great for storing passwords! We don't store the password at all. We just store the hash value produced from the password. Then when a user comes and logs in, we take their password, hash it, and compare the result to the stored hash.
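
As a sketch of that flow (using SHA-256 from Python's standard library purely for illustration - a real system should use a deliberately slow password hash like bcrypt or scrypt):

import hashlib

def hash_password(password):
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def check_password(stored_hash, attempt):
    # We never compare passwords directly - only their hashes.
    return hash_password(attempt) == stored_hash

stored = hash_password("hello")
print(check_password(stored, "hello"))  # True
print(check_password(stored, "h3llo"))  # False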

Instead of the file with explicit passwords, we get something like:

alice:7a2d28fc
mark:dfd4e1c6
joe:ed849ee1
jen:bb76e739

This is much better than storing the encrypted password. There is no encryption key that a thief can use to decrypt the password. Even if a thief knows the hash values of your users' passwords, they can't get in to the system! And your system never stores the actual values of the users' passwords - just their hashcodes!

So again, let's look at this from the perspective of a thief. How can a thief break into a system with hashed passwords?

If they don't know what hash function you're using, then they're completely stuck. Sadly, they can probably figure it out. Designing new cryptographic hash functions is hard. Implementing cryptographic hash functions correctly is hard. As a result, most people just use a hash function from a library. That means that for a thief, it's usually pretty easy to figure out what hash function a system is using.

Once they know what hash function you used, their only option for breaking your system is to guess passwords. That is, they can guess passwords, compute their hash codes, and search through your password file to see if any of the users' password hashes match. If they find one, they're gold!

In fact, there's a common strategy based on this idea called a rainbow table. A rainbow table is a list of common passwords, together with the numeric values that they hash to under a common cryptographic hash function. Something like:

Password String    Hash value
pass               1b93eb12
password           a4532c47
abc                7a2d28fc
...                ...

If you can somehow steal the password file, then with a rainbow table, you can find users with common passwords. For example, in the table above, you can see that the hashcode "7a2d28fc" occurs in the password file for the username "alice", and it's also in the rainbow table for the password "abc". So a thief could determine that alice's password was "abc". Even with the best cryptographic hash, all it takes is one idiot user who uses "password" as their password, and your system's security is breached.
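
The attack really is nothing more than a table lookup. A minimal sketch, using the toy values from above:

# The rainbow table maps hash values back to the passwords that produce them.
rainbow = {
    "1b93eb12": "pass",
    "a4532c47": "password",
    "7a2d28fc": "abc",
}

stolen_file = {"alice": "7a2d28fc", "mark": "dfd4e1c6", "joe": "ed849ee1"}

for user, hashcode in stolen_file.items():
    if hashcode in rainbow:
        print(user, "has password", rainbow[hashcode])  # alice has password abc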

Salting passwords addresses that problem. In a salting strategy, you don't hash a user's password by itself: you combine it with some additional data, and then hash that combination. The additional information is called the salt.

You can use lots of different things for the salt. There's a complex set of tradeoffs in the exact salting strategy, which are beyond the scope of this post, but a few examples include:

  1. Always use a fixed salt string. This is weak, but better than nothing. It's got a similar weakness to the encrypted password system: you only need one salt to give you a handle on breaking all of the passwords, and that one salt needs to be in the system.
  2. Add a random piece of data for each password (see the sketch after this list). The catch here is that you need to store the salt data for each password. This is what unix passwords used to use: they added 12 random bits to each password, and stored both the salt and the hashed password in the passwords file. The weakness of this is that the salt is right there with the password entry. But because each user has a different salt, any attempt to breach the system needs to attack each user separately.
  3. Salt on metadata: that is, take information about the user that isn't part of their username, and use that as the salt. For example, you could use a person's birthday as the salt for their account.
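
Here's a minimal sketch of strategy 2 - a random per-user salt, stored alongside the hash (SHA-256 again, purely for illustration):

import hashlib
import os

def hash_with_salt(password, salt=None):
    if salt is None:
        salt = os.urandom(2)  # 16 random bits, in the spirit of unix's old 12
    digest = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return salt, digest       # both get stored in the password file

def check(password, salt, stored_digest):
    return hash_with_salt(password, salt)[1] == stored_digest

salt, digest = hash_with_salt("abc")
print(check("abc", salt, digest))  # True: same salt + same password = same hash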

If each user has a different salt, then even if you've got terrible passwords, a thief needs to do a lot of work to try to break your system. Even with a rainbow-table like strategy, they can't compute the hashcode for a given common password once, and then search the password hash list for that code - they need to recompute it for each possible salt value!

What salting does is, effectively, increase the amount of effort needed to break the passwords. If you add 12 bits of salt, then a rainbow table needs 4096 times more entries to cover the same set of common passwords! If your salt is long enough, it can make it effectively impossible to build a rainbow table at all. And without a rainbow table, a 12-bit salt means that your attacker needs to attack the passwords of each of your users separately! Even if they know the value of the salt, you've made it much harder for them to breach your security.


New Dimensions of Crackpottery

(by MarkCC) Feb 26 2013

I have, in the past, ranted about how people abuse the word "dimension", but it's been a long time. One of my followers on twitter sent me a link to a remarkable piece of crackpottery which is a great example of how people simply do not understand what dimensions are.

There are several ways of defining "dimension" mathematically, but they all come back to one basic concept. A dimension is an abstract concept of a direction. We can use the number of dimensions in a space as a way of measuring properties of that space, but those properties all come back to the concept of direction. A dimension is neither a place nor a state of being: it is a direction.

Imagine that you're sitting in an abstract space. You're at one point. There's another point that I want you to go to. In order to uniquely identify your destination, how many directions do I need to mention?

If the space is a line, you only need one: I need to tell you the distance. There's only one possible direction that you can go, so all I need to tell you is how far. Since you only need one direction, the line is one-dimensional.

If the space is a plane, then I need to tell you two things. I could do that by saying "go right three steps then up four steps", or I could say "turn 53 degrees clockwise, and then walk forward 5 steps." But there's no way I can tell you how to get to your destination with fewer than two directions. You need two directions, so the plane is two dimensional.

If the space is the interior of a cube, then you'll need three directions, which means that the cube is three dimensional.

On to the crackpottery!

E=mc² represents a translation across dimensions, from energy to matter.

No, it does not. Energy and matter are not dimensions. \(e=mc^2\) is a statement about the fundamental relation between energy and matter, not a statement about dimensions. Our universe could be 2 dimensional, 3 dimensional, 4 dimensional, or 22 dimensional: relativity would still mean the same thing, and it's not a statement about a "translation across dimensions".

Energy can travel at the speed of light, and as Special Relativity tells us, from the perspective of light speed it takes no time to travel any distance. In this way, energy is not bound by time and space the way matter is. Therefore, it is in a way five-dimensional, or beyond time.

Bzzt, no.

Energy does not travel. Light travels, and light can transmit energy, but light isn't energy. Or, from another perspective, light is energy: but so is everything else. Matter and energy are the same thing.

From the perspective of light speed time most certainly does pass, and it does take plenty of time to travel a distance. Light takes roughly 8 minutes to get from the sun to the earth. What our intrepid author is trying to talk about here is the idea of time dilation. Time dilation describes the behavior of particles with mass when they move at high speeds. As a massive particle moves faster and approaches the speed of light, the mass of the particle increases, and the particle's experience of time slows. If you could accelerate a massive particle to the speed of light, its mass would become infinite, and time would stop for the particle. "If" is the key word there: it can't. It would require an infinite amount of energy to accelerate it to the speed of light.

But light has no mass. Relativity describes a strange property of the universe, which is hard to wrap your head around. Light always moves at the same speed, no matter your perspective. Take two spacecraft in outer space, which are completely stationary relative to each other. Shine a laser from one, and measure how long it takes for the light to get to the other. How fast is it going? Roughly 186,000 miles/second. Now, start one ship moving away from the other at half the speed of light. Repeat the experiment. One ship is moving away from the other at a speed of 93,000 miles/second. From the perspective of the moving ship, how fast is the light moving away from it towards the other ship? 186,000 miles/second. From the perspective of the stationary ship, how fast is the laser light approaching it? 186,000 miles/second.

It's not that there's some magic thing about light that makes it move while time stops for it. Light is massless, so it can move at the speed of light. Time dilation doesn't apply because it has no mass.

But even if that weren't the case, that's got nothing to do with dimensionality. Dimensionality is a direction: what does this rubbish have to do with the different directions that light can move in? Absolutely nothing: the way he's using the word "dimension" has nothing to do with what dimensions mean.

All “objects” or instances of matter are time-bound; they change, or die, or dissolve, or evaporate. Because they are subject to time, objects can be called four-dimensional.

Nope.

Everything in our universe is subject to time, because time is one of the dimensions in our universe. Time is a direction that we move. We don't have direct control over it - but it's still a direction. When and where did I write this blog post compared to where I am when you're reading it? The only way you can specify that is by saying how far my position has changed in four directions: 3 spatial directions, and time. Time is a dimension, and everything in our universe needs to consider it, because you can't specify anything in our universe without all four dimensions.

The enormous energy that can be released from a tiny object (as in an atomic bomb) demonstrates the role dimensions play in constructing reality.

No: the enormous energy that can be released from a tiny object demonstrates the fact that a small quantity of matter is equivalent to a large quantity of energy. As you'd expect if you look at that original equation: \(e=mc^2\). A gram of mass - something the size of a paperclip - is equivalent to about 25 million kilowatt-hours of energy - or more than the total yearly energy use of 1,200 average Americans. That's damned impressive and profound, without needing to draw in any mangled notions of dimensions or magical dimensional powers.
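
That arithmetic is easy to check for yourself:

m = 0.001            # one gram, in kilograms
c = 299792458.0      # the speed of light, in meters/second
E = m * c**2         # joules; this is all e=mc^2 says
print(E)             # about 9e13 joules
print(E / 3.6e6)     # a kilowatt-hour is 3.6 million joules: about 25 million kWh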

Higher dimensions are mind-blowingly powerful; even infinitely so. Such power is so expansive that it can’t have form, definition, or identity, like a ball of uranium or a human being, without finding expression in lower dimensions. The limitations of time and space allow infinite power to do something other than constantly annihilate itself.

Do I even need to respond to this?

Einstein’s equation E=mc² bridges the fourth and the fifth dimensions, expressed as matter and energy. Imagine a discovery that bridges expressions of the fifth and sixth dimensions, such as energy and consciousness. Consciousness has the five-dimensional qualities of energy, but it can’t be “spent” in the way energy can because it doesn’t change form the way energy does. Therefore, it’s limitless.

And now we move from crackpottery to mysticism. Einstein's mass-energy equation doesn't bridge dimensions, and dimensionality has nothing do with mass-energy equivalence. And now our crackpot friend suddenly throws in another claim, that consciousness is the sixth dimension? Or consciousness is the bridge between the fifth and sixth dimensions? It's hard to figure out just what he's saying here, except for the fact that it's got nothing to do with actual dimensions.

Is there a sixth dimension? Who knows? According to some modern theories, our universe actually has many more than the 4 dimensions that we directly experience. There could be 6 or 10 or 20 dimensions. But if there are, those dimensions are just other directions that things can move. They're not abstract concepts like "consciousness".

And of course, this is also remarkably sloppy logic:

  1. Consciousness has the 5-dimensional qualities of energy
  2. Consciousness can't be spent.
  3. Consciousness can't change form.
  4. Therefore consciousness is unlimited.

The first three statements are just blind assertions, given without evidence or argument. The fourth is presented as a conclusion drawn from the first three - but it's a non-sequitur. There's no real way to conclude the last statement given the first three. Even if you give him all the rope in the world, and accept those three statements as axioms - it's still garbage.


The Intellectual Gravity of Brilliant Baseball Players

(by MarkCC) Feb 21 2013

Some of my friends at work are baseball fans. I totally don't get baseball - to me, it's about as interesting as watching paint dry. But thankfully, some of my friends disagree, which is how I found this lovely little bit of crackpottery.

You see, there's a (former?) baseball player named Jose Canseco, who's been plastering twitter with his deep thoughts about science.


At first glance, this is funny, but not particularly interesting. I mean, it's a classic example of my mantra: the worst math is no math.

The core of this argument is pseudo-mathematical. The dumbass wants to argue that under current gravity, it wouldn't be possible for things the size of the dinosaurs to move around. The problem with this argument is that there's no problem! Things the size of dinosaurs could move about in current gravity with absolutely no difficulty. If you actually do the math, it's fine.
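
The math in question is the square-cube law: scale an animal up by a factor of k, and its weight grows like k^3 while its muscle strength - which is proportional to muscle cross-sectional area - only grows like k^2. A quick sketch of what that means:

def relative_strength(scale):
    # Weight scales with volume (scale^3); strength with area (scale^2).
    weight = scale ** 3
    strength = scale ** 2
    return strength / weight

for k in (1, 2, 5, 10):
    print(k, relative_strength(k))  # at 10x the size: 1/10th the strength-to-weight

That's exactly why naively scaled-up anatomy fails - and, as we're about to see, why big animals solve the problem with different anatomy, not different gravity.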

If dinosaurs had the anatomy of human beings, then it's true that if you scaled them up, they wouldn't be able to walk. But they didn't. They had anatomical structures that were quite different from ours in order to support their massive size. For example, here's a bone from Quetzalcoatlus:

[image: muscle attachment knob on a Quetzalcoatlus bone] See the massive knob sticking out to the left? That's a muscle attachment point. That gave the muscles much greater torque than ours have, which they needed. (Yes, I know that Quetzalcoatlus wasn't really a dinosaur, but it is one of the kinds of animals that Canseco was talking about, and it was easy to find a really clear image.)

Most animal joints are, essentially, lever systems. Muscles attach to two different bones, which are connected by a hinge. The muscle attachment points stick out relative to the joint. When the muscles contract, that creates a torque which rotates the bones around the joint.

The lever is one of the most fundamental machines in the universe. It operates by the principle of torque. Our regular daily experience shows that levers magnify our efforts. I can't walk up to a car and lift it. But with a lever, I can. Muscle attachment points are levers. Take another look at that bone picture: what you're seeing is a massive lever to magnify the efforts of the muscles. That's all that a large animal needed to be able to move around in earth's gravity.
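
The arithmetic behind that is just the torque equation: torque = force × lever arm. A toy comparison, with made-up numbers purely for illustration:

muscle_force = 1000.0  # newtons - the same muscle in both cases
small_knob = 0.02      # a 2 cm attachment-point projection
big_knob = 0.10        # a 10 cm projection, like the knob on that bone
print(muscle_force * small_knob)  # 20 N*m of torque
print(muscle_force * big_knob)    # 100 N*m: 5x the torque from the same muscle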

This isn't just speculation - this is stuff that's been modeled in great detail. And it's stuff that can be observed in modern-day animals. Look at the skeleton of an elephant, and compare it to the skeleton of a dog. The gross structure is very similar - they are both quadrupedal mammals. But if you look at the bones, the muscle attachment points in the elephant's skeleton have much larger projections, to give the muscles greater torque. Likewise, compare the skeleton of an American robin with the skeleton of a mute swan: the swan (which has a maximum recorded wingspan of 8 feet!) has much larger projections on the attachment points for its muscles. If you just scaled a robin from its 12 inch wingspan to the 8 foot wingspan of a swan, it wouldn't be able to walk, much less fly! But the larger bird's anatomy is different in order to support its size - and it can and does fly with those 8 foot wings!

That means that on the basic argument for needing different gravity, Canseco fails miserably.

Canseco's argument for how gravity allegedly changed is even worse.

What he claims is that at the time when the continental land masses were joined together as the Pangaea supercontinent, the earth's core moved to counterbalance the weight of the continents. Since the earth's core was, after this shift, farther from the surface, the gravity at the surface would be smaller.

This is an amusingly ridiculous idea. It's even worse than Ted Holden and his reduced-felt-gravity because of the electromagnetic green saturn-star.

First, the earth's core isn't some lump of stuff that can putter around. The earth is a solid ball of material. It's not like a ball of powdered chalk with a solid lump of uranium at the center. The core can't move.

Even if it could, Canseco is wrong. Canseco is playing with two different schemes of how gravity works. We can approximate the behavior of gravity on earth by assuming that the earth is a point: for most purposes, gravity behaves almost as if the entire mass of the earth was concentrated at the earth's center of mass. Canseco is using this idea when he moves the "core" further from the surface. He's using the idea that the core (which surrounds the center of mass in the real world) is the center of mass. So if the core moves, and the center of mass moves with it, then the point-approximation of gravity will change because the distance from the center of mass has increased.
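For reference, the point approximation is just Newton's law, \(g = \frac{GM}{r^2}\). A quick sanity check (standard constants; the 100km shift is my own arbitrary number) shows how modest the effect of moving the center of mass would be:

# Point-mass approximation of surface gravity: g = G*M / r**2
G = 6.674e-11    # gravitational constant (m^3 kg^-1 s^-2)
M = 5.972e24     # mass of the earth (kg)
r = 6.371e6      # mean radius of the earth (m)

def surface_gravity(distance_to_center):
    return G * M / distance_to_center ** 2

print(surface_gravity(r))          # ~9.8 m/s^2
print(surface_gravity(r + 1e5))    # ~9.5 m/s^2, even with the center of mass
                                   # shifted a full 100km farther away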

But: the reason that he claims the core moved is because it was responding to the combined landmasses on the surface clumping together as pangea. That argument is based on the idea that the core had to move to balance the continents. In that case, the center of gravity wouldn't be any different - if the core could move to counterbalance the continents, it would move just enough to keep the center of gravity where it was - so if you were using the point approximation of gravity, it would be unaffected by the shift.

He's combining incompatible assumptions. To justify moving the earth's core, he's *not* using a point-model of gravity. He's treating the core and the continents as separate masses that can move relative to each other and balance one another. But when he wants to talk about the effect of gravity on an animal at the surface, he wants to treat the full mass of the earth as a point source - and he wants that point source to be located at the core.

It doesn't work that way.

The thing that I find most interesting about this particular bit of crackpottery isn't the argument itself, but the family of crackpottery that it belongs to.

People are fascinated by the giant creatures that used to live on the earth. Intuitively, because we don't see giant animals in the world around us, there's a natural tendency to ask "Why?". And being the pattern-seekers that we are, we intuitively believe that there must be a reason why the animals back then were huge, but the animals today aren't. It can't just be random chance. So people keep coming up with reasons. Like:

  1. Neal Adams: who argues that the earth is constantly growing larger, and that gravity is an illusion caused by that growth. One of the reasons, according to his "theory", for why we know that gravity is just an illusion, is because the dinosaurs supposedly couldn't walk in current gravity.
  2. Ted Holden and the Neo-Velikovskians: who argue that the solar system is drastically different today than it used to be. According to Holden, Saturn used to be a "hyperintelligent green electromagnetic star", and the earth used to be tide-locked in orbit around it. As a result, the felt effect of gravity was weaker.
  3. Stephen Hurrell, who argues similarly to Neal Adams that the earth is growing. Hurrell doesn't dispute the existence of gravity the way that Adams does, but similarly argues that dinosaurs couldn't walk in present day gravity, and resorts to an expanding earth to explain how gravity could have been weaker.
  4. Ramin Amir Mardfar: who claims that the earth's mass has been continually increasing because meteors add mass to the earth.
  5. Gunther Bildmeyer, who argues that gravity is really an electromagnetic effect, and so the known fluctuations in the earths magnetic fields change gravity. According to him, the dinosaurs could only exist because of the state of the magnetic field at the time, which reduced gravity.

There are many others. All of them grasping at straws, trying to explain something that doesn't need explaining, if only they'd bother to do the damned math, and see that all it takes is a relatively small anatomical change.


Euler's Equation Crackpottery

(by MarkCC) Feb 18 2013

One of my twitter followers sent me an interesting piece of crackpottery. I debated whether to do anything with it. The thing about crackpottery is that it really needs to have some content. Total incoherence isn't amusing. This bit is, frankly, right on the line.

Euler's Equation and the Reality of Nature.

a) Euler's Equation as a mathematical reality.

Euler's identity is "the gold standard for mathematical beauty'.
Euler's identity is "the most famous formula in all mathematics".
‘ . . . this equation is the mathematical analogue of Leonardo
da Vinci’s Mona Lisa painting or Michelangelo’s statue of David’
‘It is God’s equation’, ‘our jewel ‘, ‘ It is a mathematical icon’.
. . . . etc.

b) Euler's Equation as a physical reality.

"it is absolutely paradoxical; we cannot understand it,
and we don't know what it means, . . . . .’
‘ Euler's Equation reaches down into the very depths of existence’
‘ Is Euler's Equation about fundamental matters?’
‘It would be nice to understand Euler's Identity as a physical process
using physics.‘
‘ Is it possible to unite Euler's Identity with physics, quantum physics ?’

My aim is to understand the reality of nature.

Can Euler's equation explain me something about reality?

To give the answer to this. question I need to bind Euler's equation with an object – particle. Can it be math- point or string- particle or triangle-particle? No, Euler's formula has quantity (pi) which says me that the particle must be only a circle .

Now I want to understand the behavior of circle - particle and therefore I need to use spatial relativity and quantum theories. These two theories say me that the reason of circle – particle’s movement is its own inner impulse (h) or (h*=h/2pi).

a) Using its own inner impulse (h) circle - particle moves ( as a wheel) in a straight line with constant speed c = 1. We call such particle - ‘photon’. From Earth – gravity point of view this speed is maximally. From Vacuum point of view this speed is minimally. In this movement quantum of light behave as a corpuscular (no charge).

b) Using its own inner impulse / intrinsic angular momentum ( h* = h / 2pi ) circle - particle rotates around its axis. In such movement particle has charge, produce electric waves ( waves property of particle) and its speed ( frequency) is : c.

1. We call such particle - ‘ electron’ and its energy is: E=h*f.

In this way I can understand the reality of nature.

==.

Best wishes.

Israel Sadovnik Socratus.

Euler's equation says that \(e^{i\pi} + 1 = 0\). It's an amazingly profound equation. The way that it draws together fundamental concepts is beautiful and surprising.

But it's not nearly as mysterious as our loonie-toon makes it out to be. The natural logarithm base e is deeply embedded in the structure of numbers, and we've known how it works for a long time. What Euler did was show the relationship between e and the fundamental rotation group of the complex numbers. There are a couple of ways of restating the definition of e that make the meaning of that relationship clearer.

For example:

\[e^z = \lim_{n \rightarrow \infty}\left(1 + \frac{z}{n}\right)^n\]

That's an alternative definition of what e is. If we use that, and we plug \(i\pi\) into it, we get:

\[e^{i\pi} = \lim_{n \rightarrow \infty}\left(1+\frac{i\pi}{n}\right)^n\]

If you work out that limit, it's -1. Also, if you fix a large value of n, and plot \((1 + \frac{i\pi}{n})^1\), \((1+\frac{i\pi}{n})^2\), \((1 + \frac{i\pi}{n})^3\), \((1 + \frac{i\pi}{n})^4\), and so on up to \((1 + \frac{i\pi}{n})^n\) on the complex plane, then as n gets larger, the resulting curve gets closer and closer to a semicircle.
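You can watch that limit converge numerically - here's a quick check using Python's built-in complex numbers:

# Watch (1 + i*pi/n)**n approach e^{i*pi} = -1 as n grows.
import cmath

for n in (10, 100, 10000, 1000000):
    print(n, (1 + 1j * cmath.pi / n) ** n)
# The results close in on (-1+0j); the error in the magnitude
# shrinks roughly like pi**2/(2n).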

An equivalent way of seeing it is that multiplying by \(e^{i\theta}\) is a rotation by \(\theta\) in the complex number plane. The reason that \(e^{i\pi} = -1\) is because if you take the complex number \(1 + 0i\), and rotate it by \(\pi\) radians, you get -1: \(1 \cdot e^{i\pi} = -1\).

That's what Euler's equation means. It's amazing and beautiful, but it's not all that difficult to understand. It's not mysterious in the sense that our crackpot friend thinks it is.

But what really sets me off is the idea that it must have some meaning in physics. That's silly. It doesn't matter what the physical laws of the universe are: the values of \(\pi\) and \(e\) will not change. It's like trying to say that there must be something special about our universe that makes 1 + 1 = 2 - or, conversely, that the fact that 1+1=2 means something special about the universe we live in. These things are facts of numbers, which are independent of physical reality. Create a universe with different values for all of the fundamental constants - e and π will be exactly the same. Create a universe with less matter - e and π will still be the same. Create a universe with no matter, a universe with different kinds of matter, a universe with 300 forces instead of the four that we see - and e and π won't change.

What things like e and π, and their relationship via Euler's equation, tell us is that there's a fundamental relationship between numbers and shapes on an idealized two-dimensional plane - a plane which does not and cannot physically exist in the world we live in.

Beyond that, what he's saying is utter rubbish. For example:

These two theories say me that the reason of circle – particle’s movement is its own inner impulse (h) or (h*=h/2pi). Using its own inner impulse (h) circle - particle moves ( as a wheel) in a straight line with constant speed c = 1. We call such particle - ‘photon’. From Earth – gravity point of view this speed is maximally. From Vacuum point of view this speed is minimally. In this movement quantum of light behave as a corpuscular (no charge).

This is utterly meaningless. It's a jumble of words that pretends to be meaningful and mathematical, when in fact it's just a string of syllables strung together in nonsensical ways.

There's a lot that we know about how photons behave. There's also a lot that we don't know about photons. This word salad tells us exactly nothing about photons. In the classic phrase, it's not even wrong: what it says doesn't have enough meaning to be wrong. What is the "inner impulse" of a photon according to this crackpot? We can't know: the term isn't defined. We are pretty certain that a photon is not a wheel rolling along. Is that what the crank is saying? We can't be sure. And that's the problem with this kind of crankery.

As I always say: the very worst math is no math. This is a perfect example. He starts with a beautiful mathematical fact. He uses it to jump to a completely non-mathematical conclusion. But he writes a couple of mathematical symbols, to pretend that he's using math.


The Meta of Gödel

(by MarkCC) Feb 17 2013

As you may be figuring out, there's a reason why I resisted walking through Gödel's proof of incompleteness for so long. Incompleteness isn't a simple proof!

To refresh your memory, here's a sketch of the proof:

  1. Take a simple logic. We've been using a variant of the Principia Mathematica's logic, because that's what Gödel used.
  2. Show that any statement in the logic can be encoded as a number using an arithmetic process based on the syntax of the logic. The process of encoding statements numerically is called Gödel numbering.
  3. Show that you can express meta-mathematical properties of logical statements in terms of arithmetic properties of their Gödel numbers. In particular, we need to build up the logical infrastructure that we need to talk about whether or not a statement is provable.
  4. Using meta-mathematical properties, show how you can create an unprovable statement encoded as a Gödel number.

What we've done so far is the first two steps, and part of the third. In this post, we saw the form of the Principia's logic that we're using, and how to numerically encode it as a Gödel numbering. We started on the third point in this post, by figuring out just what it means to say that things are encoded arithmetically. Now we can get to the part where we see how to encode meta-mathematical properties in terms of arithmetic properties of the Gödel numbering. In this post, we're going to build up everything we need to express syntactic correctness, logical validity, and provability in terms of arithmetical properties of Gödel numbers. (And, as a reminder, I've been using this translation of Gödel's original paper on incompleteness.)

This is the most complex part of the incompleteness proof. The basic concept of what we're doing is simple, but the mechanics are very difficult. What we want to do is define a set of predicates about logical statements, but we want those predicates to be expressed as arithmetic properties of the numerical representations of the logical statements.

The point of this is that we're showing that done in the right way, arithmetic is logic - that doing arithmetic on the Gödel numbers is doing logical inference. So what we need to do is build up a toolkit that shows us how to understand and manipulate logic-as-numbers using arithmetic. As we saw in the last post, primitive recursion is equivalent to arithmetic - so if we can show how all of the properties/predicates that we define are primitive recursive, then they're arithmetic.

This process involves a lot of steps, each of which is building the platform for the steps that follow it. I struggled quite a bit figuring out how to present these things in a comprehensible way. What I ended up with is writing them out as code in a pseudo-computer language. Before inventing this language, I tried writing actual executable code, first in Python and then in Haskell, but I wasn't happy with the clarity of either one.

Doing it in an unimplemented language isn't as big a problem as you might think. Even if this was all executable, you're not going to be able to actually run any of it on anything real - at least not before your hair turns good and gray. The way that this stuff is put together is not what any sane person would call efficient. But the point isn't to be efficient: it's to show that this is possible. This code is really all about searching; if we wanted to be efficient, this could all be done in a different representation, with a different search method that was a lot faster - but that would be harder to understand.

So, in the end, I threw together a simple language that's easy to read. This language, if it were implemented, wouldn't really even be Turing complete - it's a primitive recursive language.

Basics

We'll start off with simple numeric properties that have no obvious connection to the kinds of meta-mathematical statements that we want to talk about, but we'll use those to define progressively more and more complex and profound properties, until we finally get to our goal.

# divides n x == True if n divides x without remainder.
pred divides(n, x) = x mod n == 0

pred isPrime(0) = False
pred isPrime(1) = False
pred isPrime(2) = True
pred isPrime(n) = {
  all i in 2 to n - 1 {
    not divides(i, n)
  }
}

fun fact(0) = 1
fun fact(n) = n * fact(n - 1)

Almost everything we're going to do here is built on a common idiom. For anything we want to do arithmetically, we're going to find a bound - a maximum numeric value for it. Then we're going to iterate over all of the values smaller than that bound, searching for our target.

For example, what's the nth prime factor of x? Obviously, it can't be bigger than x, so we'll use x as our bound. (A tighter bound is often possible, but it doesn't matter. We don't care about efficiency!) To find the nth prime factor, we'll iterate over all of the numbers up to our bound x, and search for the smallest number which is prime, which divides x, and which is larger than the (n-1)th prime factor of x. We'll translate that into pseudo-code:

fun prFactor(0, x) = 0
fun prFactor(n, x) = {
  first y in 1 to x {
    isPrime(y) and divides(y, x) and prFactor(n - 1, x) < y
  }
}
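If you'd rather see the idiom in a form you can actually run, here's a direct Python rendering of divides, isPrime, and prFactor (my translation, purely for illustration - the pseudo-code above is the version we'll keep building on):

# The bounded-search idiom in Python.
def divides(n, x):
    return x % n == 0

def is_prime(n):
    return n >= 2 and all(not divides(i, n) for i in range(2, n))

def pr_factor(n, x):
    # The nth prime factor of x, or 0 if there isn't one.
    if n == 0:
        return 0
    prev = pr_factor(n - 1, x)
    for y in range(1, x + 1):    # the bound: no factor can exceed x
        if is_prime(y) and divides(y, x) and y > prev:
            return y
    return 0

print(pr_factor(2, 60))    # 3: the second prime factor of 60 = 2*2*3*5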

Similarly, for extracting values from strings, we need to be able to ask, in general, what's the nth prime number? This is nearly identical to prFactor above. The only difference is that we need a different bound. Fortunately, we know that the nth prime number can't be larger than the factorial of the previous prime plus 1: \(p! + 1\) isn't divisible by any prime up to p, so its smallest prime factor is a prime (possibly itself) bigger than p.

fun nthPrime(0) = 0
fun nthPrime(n) = {
  first y in 1 to fact(nthPrime(n - 1)) + 1  {
    isPrime(y) and y > nthPrime(n - 1)
  }
}

In composing strings of Gödel numbers, we use exponentiation. Given integers x and n, we can obviously compute \(x^n\) via primitive recursion. I'll define it below, but in the rest of this post, I'll write it as the ** operator in the language:

fun pow(n, 0) = 1
fun pow(n, i) = n * pow(n, i - 1)

String Composition and Decomposition

With those preliminaries out of the way, we can get to the point of defining something that's actually about one of the strings encoded in these Gödel numbers. Given a number x encoding a string, item(n, x) is the value of the nth character of x. (This is slow. This is really slow! We're getting to the limit of what a very powerful computer can do in a reasonable amount of time. But this doesn't matter. The point isn't that this is a good way of doing these things: it's that these things are possible. To give you an idea of just how slow this is, I started off writing the stuff in this post in Haskell. Compiled with GHC, which is a very good compiler, item to extract the 6th character of an 8 character string took around 10 minutes on a 2.4GHz laptop.)

fun item(n, x) = {
  first y in 1 to x {
    divides(prFactor(n, x) ** y, x) and
      not divides(prFactor(n, x)**(y+1), x)
  }
}

Given a string, we want to be able to ask how long it is; and given two strings, we want to be able to concatenate them.

fun length(x) = {
  first y in 1 to x {
    prFactor(y, x) > 0 and prFactor(y + 1, x) == 0
  }
}

fun concat(x, y) = {
  val lx = length(x)
  val ly = length(y)

  first z in 1 to nthPrime(lx + ly)**(x + y) {
    (all n in 1 to lx {
        item(n, z) == item(n, x)
     }) and (all n in 1 to ly {
        item(n + lx, z) == item(n, y)
      })
  }
}

fun concatl([]) = 0
fun concatl(xs) = {
  concat(head(xs), concatl(tail(xs)))
}

fun seq(x) = 2**x
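To make that concrete, here's a little Python sketch (mine, just for illustration) of the round trip: encode a sequence of symbol numbers as a product of prime powers, and then read the characters back out. This version just divides directly, instead of doing the bounded search that the proof needs.

# The string [s1, s2, ..., sk] is encoded as 2**s1 * 3**s2 * 5**s3 * ...
def nth_prime(n):
    count, candidate = 0, 1
    while count < n:
        candidate += 1
        if all(candidate % i for i in range(2, candidate)):
            count += 1
    return candidate

def encode(symbols):
    n = 1
    for i, s in enumerate(symbols, start=1):
        n *= nth_prime(i) ** s
    return n

def item(n, x):
    # The nth character: the exponent of the nth prime in x.
    p, e = nth_prime(n), 0
    while x % p == 0:
        x, e = x // p, e + 1
    return e

g = encode([9, 17, 11, 17, 13])    # the symbol string <forall, x1, (, x1, )>
print([item(i, g) for i in range(1, 6)])    # [9, 17, 11, 17, 13]: recovered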

We want to be able to build statements represented as numbers from other statements represented as numbers. We'll define a set of functions that either compose new strings from other strings, or check whether a particular string is a particular kind of syntactic element.

# x is a variable of type n.
pred vtype(n, x) = {
  some z in 17 to x {
    isPrime(z) and x == z**n
  }
}

# x is a variable
pred isVar(x) = {
  some n in 1 to x {
    vtype(n, x)
  }
}

fun paren(x) =
  concatl([seq(11), x, seq(13)])

# given the Gödel number for a statement x, find
# the Gödel number for not x.
fun gnot(x) =
  concat(seq(5), paren(x))

# Create the number for x or y.
fun gor(x, y) =
  concatl([paren(x), seq(7), paren(y)])

# Create the number for 'forall x(y)'.
fun gforall(x, y) =
  concatl([seq(9), seq(x), paren(y)])

# Create the number for x with n invocations of the primitive
# successor function.
fun succn(0, x) = x
fun succn(n, x) = concat(seq(3), succn(n - 1, x))

# Create the number n using successor and 0.
fun gnumber(n) = succn(n, seq(1))

# Check if a statement is type-1.
pred stype_one(x) = {
  some m in 1 to x {
    some n in 0 to x {
      (m == 1 or vtype(1, m)) and x == succn(n, seq(m))
    }
  }
}

# Check if a statement is type n.
pred fstype(1, x) = stype_one(x)
pred fstype(n, x) = {
  some v in 1 to x {
    vtype(n, v) and x == seq(v)
  }
}

A note on that last function: the translation of Gödel that I'm using says R(v) without defining R. In Gödel's own notation, R(v) is the number of the sequence consisting of the single symbol v - which is exactly what we've been calling seq(v) - so that's what I've used above.

Formulae

Using what we've defined so far, we're now ready to start defining formulae in the basic Principia logic. Formulae are strings, but they're strings with a constrained syntax.

pred elFm(x) = {
  some y in 1 to x {
    some z in 1 to x {
      some n in 1 to x {
        fstype(n, y) and fstype(n+1, z) and x == concat(z, paren(y))
      }
    }
  }
}

All this is doing is expressing the grammar rule in arithmetic form: an elementary formula is a predicate application P(x), where x is a variable of level n, and P is a variable of level n + 1.

The next grammar rule that we encode this way says how we can combine elementary formulae using operators. There are three operators: negation, conjunction, and universal quantification.

pred op(x, y, z) = {
  x == gnot(y) or
  x == gor(y, z) or
  (some v in 1 to x { isVar(v) and x == gforall(v, y) })
}

And now we can start getting complex. We're going to define the idea of a valid sequence of formulae. x is a valid sequence of formulae when it's formed from a collection of formulae, each of which is either an elementary formula, or is produced from the formulae which occurred before it in the sequence using either negation, logical-or, or universal quantification.

In terms of a more modern way of talking about it, the syntax of the logic is a grammar. A formula sequence, in this system, is another way of writing the parse-tree of a statement: the sequence is the parse-tree of the last statement in the sequence.

pred fmSeq(x) = {
  all n in 0 to length(x) {
    elFm(item(n, x)) or
      some p in 0 to (n - 1) {
        some q in 0 to (n - 1) {
          op(item(n,x), item(p, x), item(q, x))
        }
      }
  }
}

The next one bugs me, because it seems wrong, but it isn't really! It's a way of encoding the fact that a formula is the result of a well-defined sequence of formulae. In order to ensure that we're doing primitive recursive formulae, we're always thinking about sequences of formulae, where the later formulae are produced from the earlier ones. The goal of the sequence of formula is to produce the last formula in the sequence. What this predicate is really saying is that a formula is a valid formula if there is some sequence of formulae where this is the last one in the sequence.

Rephrasing that in grammatical terms, a string is a formula if there is valid parse tree for the grammar that produces the string.

pred isFm(x) = {
  some n in 1 to nthPrime(length(x)**2)**(x*length(x)**2) {
    fmSeq(n) and item(length(n), n) == x
  }
}

So, now, can we say that a statement is valid because it's parsed according to the grammar? Not quite. It's actually a familiar problem for people who write compilers. When you parse a program in some language, the grammar doesn't usually specify that variables must be declared before they're used. It's too hard to get that into the grammar. In this logic, we've got almost the same problem: the grammar hasn't restricted us to only use bound variables. So we need to have ways to check whether a variable is bound in a Gödel-encoded formula, and then use that to check the validity of the formula.

# The variable v is bound in formula x at position n.
pred bound(v, n, x) = {
  isVar(v) and isFm(x) and
  (some a in 1 to x {
    some b in 1 to x {
      some c in 1 to x {
        x == concatl([a, gforall(v, b), c]) and
        isFm(b) and
        length(a) + 1 ≤ n ≤ length(a) + length(gforall(v, b))
      }
    }
  })
}

# The variable v is free in formula x at position n
pred free(v, n, x) = {
  isVar(v) and isFm(x) and
  v == item(n, x) and n ≤ length(x) and not bound(v, n, x)
}

pred free(v, x) = {
  some n in 1 to length(x) {
    free(v, n, x)
  }
}

To do logical inference, we need to be able to do things like replace a variable with a specific inferred value. We'll define how to do that:

# replace the item at position n in x with y.
fun insert(x, n, y) = {
  first z in 1 to nthPrime(length(x) + length(y))**(x+y) {
    some u in 1 to x {
      some v in 1 to x {
        x == concatl([u, seq(item(n, x)), v]) and
        z == concatl([u, y, v]) and
        n == length(u) + 1
      }
    }
  }
}

There are inference operations and validity checks that we can only do if we know whether a particular variable is free at a particular position.

# freePlace(k, v, x) is the k+1st place in x (counting from the end)
# where v is free.
fun freePlace(0, v, x) = {
  first n in 1 to length(x) {
    free(v, n, x) and
    not some p in n + 1 to length(x) {
      free(v, p, x)
    }
  }
}

fun freePlace(k, v, x) = {
  first n in 1 to freePlace(k - 1, v, x) - 1 {
    free(v, n, x) and
    not some p in n + 1 to freePlace(k - 1, v, x) - 1 {
      free(v, p, x)
    }
  }
}

# number of places where v is free in x
fun nFreePlaces(v, x) = {
  first n in 1 to length(x) {
    freePlace(n, v, x) == 0
  }
}

In the original logic, some inference rules are defined in terms of a primitive substitution operator, which we wrote as subst[v/c](a) to mean substitute the value c for the variable v in the statement a. We'll build that up in a couple of steps, using the freePlace function that we just defined.

# subst' replaces the first k free instances of v in x
# (counting from the end) with y.
fun subst'(0, x, v, y) = x
fun subst'(k, x, v, y) =
  insert(subst'(k - 1, x, v, y), freePlace(k - 1, v, x), y)

# subst replaces all instances of v with y
fun subst(x, v, y) = subst'(nFreePlaces(v, x), x, v, y)

The next thing we're going to do isn't, strictly speaking, absolutely necessary. Some of the harder stuff we want to do will be easier to write using things like implication, which aren't built-in primitives of the Principia logic. To write those as clearly as possible, we'll define the full suite of usual logical operators in terms of the primitives.

# implication
fun gimp(x, y) = gor(gnot(x), y)

# logical and
fun gand(x, y) = gnot(gor(gnot(x), gnot(y)))

# if/f
fun gequiv(x, y) = gand(gimp(x, y), gimp(y, x))

# existential quantification
fun gexists(v, y) = gnot(gforall(v, gnot(y)))

Axioms

The Peano axioms are valid logical statements, so they have Gödel numbers in this system. We could compute their value, but why bother? We know that they exist, so we'll just give them names, and define a predicate to check if a value matches them.

The Peano axioms, in the form used in the incompleteness proof, are:

  1. Zero: ¬(succ(x1) = 0)
  2. Uniqueness: succ(x1) = succ(y1) ⇒ x1 = y1
  3. Induction: x2(0) ∧ ∀x1(x2(x1) ⇒ x2(succ(x1))) ⇒ ∀x1(x2(x1))

const pa1 = ...
const pa2 = ...
const pa3 = ...

pred peanoAxiom(x) =
  (x == pa1) or (x == pa2) or (x == pa3)

Similarly, we know that the propositional axioms must have numbers. The propositional
axioms are:

  1. \(p \lor p \Rightarrow p\)
  2. \(p \Rightarrow p \lor q\)
  3. \(p \lor q \Rightarrow q \lor p\)
  4. \((p \Rightarrow q) \Rightarrow (r \lor p \Rightarrow r \lor q)\)

I'll show the translation of the first - the rest follow the same pattern.

# Check if x is a statement that is a form of propositional
# axiom 1: y or y => y
pred prop1Axiom(x) = {
  some y in 1 to x {
    isFm(y) and x == gimp(gor(y, y), y)
  }
}

pred prop2Axiom(x) = ...
pred prop3Axiom(x) = ...
pred prop4Axiom(x) = ...
pred propAxiom(x) = prop1Axiom(x) or prop2Axiom(x) or
    prop3Axiom(x) or prop4Axiom(x)

Similarly, all of the other axioms are written out in the same way, and we add a predicate isAxiom to check if something is an axiom. Next are the quantifier axioms, which are complicated, so I'll only write out one of them - the other follows the same basic scheme.

The two quantifier axioms are:

  1. \(\forall v(a) \Rightarrow \text{subst}[v/c](a)\)
  2. \(\forall v(b \lor a) \Rightarrow (b \lor \forall v(a))\)

pred quantifier_axiom1_condition(z, y, v) = {
  not some n in 1 to length(y) {
    some m in 1 to length(z) {
      some w in 1 to z {
         w == item(m, z) and bound(w, n, y) and free(v, n, y)
      }
    }
  }
}

pred quantifier1Axiom(x) = {
  some v in 1 to x {
    some y in 1 to x {
      some z in 1 to x {
        some n in 1 to x {
          vtype(n, v) and fstype(n, z) and
          isFm(y) and
          quantifier_axiom1_condition(z, y, v) and
          x == gimp(gforall(v, y), subst(y, v, z))
        }
      }
    }
  }
}

pred quantifier2Axiom(x) = ...
pred isQuantifierAxiom(x) = quantifier1Axiom(x) or quantifier2Axiom(x)

We need to define a predicate for the reducibility axiom (basically, the Principia's version of the ZFC axiom of comprehension). The reducibility axiom is a schema: for any predicate \(a\), \(\exists u (\forall v (u(v) \Leftrightarrow a))\). In our primitive recursive system, we can check if something is an instance of the reducibility axiom schema with:

pred reduAxiom(x) = {
  some u in 1 to x {
    some v in 1 to x {
      some y in 1 to x {
        some n in 1 to x {
          vtype(n, v) and
          vtype(n+1, u) and
          not free(u, y) and
          isFm(y) and
          x == gexists(u, gforall(v, gequiv(concat(seq(u), paren(seq(v))), y)))
        }
      }
    }
  }
}

Now, the set axiom. In the logic we're using, this is the axiom that defines set equality. It's written as \(\forall x_1 (x_2(x_1) \Leftrightarrow y_2(x_1)) \Rightarrow x_2 = y_2\). Set equality is defined for all types of sets, so we need to have one version of the axiom for each level. We do that using type-lifting: we say that the axiom is true for type-1 sets, and that any type-lift of the level-1 set axiom is also a version of the set axiom.

fun typeLift(n, x) = {
  first y in 1 to x**(x**n) {
    all k in 1 to length(x) {
      (item(k, x) ≤ 13 and item(k, y) == item(k, x)) or
      (item(k, x) > 13 and item(k, y) == item(k, x) * prFactor(1, item(k, x))**n)
    }
  }
}

We haven't defined the type-1 set axiom. But we just saw the axiom above, and it's obviously a simple logical statement. That means that it's got a Gödel number. Instead of computing it, we'll just say that the number is called sa1. Now we can define a predicate to check if something is a set axiom:

val sa1 = ...
pred setAxiom(x) = {
  some n in 1 to x {
    x == typeLift(n, sa1)
  }
}

We've now defined all of the axioms of the logic, so we can now create a general predicate to see if a statement fits into any of the axiom categories:

pred isAxiom(x) =
  peanoAxiom(x) or propAxiom(x) or isQuantifierAxiom(x) or
  reduAxiom(x) or setAxiom(x)

Proofs and Provability!

With all of the axioms expressible in primitive recursive terms, we can start on what it means for something to be provable. First, we'll define what it means for some statement x to be an immediate consequence of some statements y and z. (Back when we talked about the Principia's logic, we said that x is an immediate consequence of y and z if either y is the formula z ⇒ x, or x is the formula ∀v.y.)

pred immConseq(x, y, z) = {
  y == gimp(z, x) or
  some v in 1 to x {
    isVar(v) and x == gforall(v, y)
  }
}

Now, we can use our definition of an immediate consequence to specify when a sequence of formulae is a proof figure. A proof figure is a sequence of statements where each statement in it is either an axiom, or an immediate consequence of two of the statements that preceded it.

pred isProofFigure(x) = {
  (all n in 1 to length(x) {
    isAxiom(item(n, x)) or
    some p in 1 to n - 1 {
      some q in 1 to n - 1 {
        immConseq(item(n, x), item(p, x), item(q, x))
      }
    }
  }) and
  length(x) > 0
}

We can say that x is a proof of y if x is a proof figure, and the last statement in x is y.

pred proofFor(x, y) =
  isProofFigure(x) and
  item(length(x), x) == y

Finally, we can get to the most important thing! We can define what it means for something to be provable! It's provable if there's a proof for it!

pred provable(x) = {
  some y {
    proofFor(y, x)
  }
}

Note that this last one is not primitive recursive! There's no way that we can create a bound for this: a proof can be any length.

At last, we're done with these definitions. What we've done here is really amazing: now, every logical statement can be encoded as a number. Every proof in the logic can be encoded as a sequence of numbers: if something is provable in the Principia logic, we can encode that proof as a string of numbers, and check the proof for correctness using nothing but (a whole heck of a lot of) arithmetic!

Next post, we'll finally get to the most important part of what Gödel did. We've been able to define what it means for a statement to be provable - we'll use that to show that there's a way of creating a number encoding the statement that something is not provable. And we'll show how that means that there is a true statement in the Principia's logic which isn't provable using the Principia's logic, which means that the logic isn't complete.

In fact, the proof that we'll do shows a bit more than that. It doesn't just show that the Principia's logic is incomplete. It shows that any consistent formal system like the Principia, any system which is powerful enough to encode Peano arithmetic, must be incomplete.


Defining Properties Arithmetically (part 1): Gödel and Primitive Recursion

(by MarkCC) Feb 11 2013

When I left off, we'd seen how to take statements written in the logic of the Principia Mathematica, and convert them into numerical form. What we need to see now is how to get meta-mathematical.

We want to be able to write second-order logical statements. The basic trick to incompleteness is that we're going to use the numerical encoding of statements to say that a predicate or relation is represented by a number. Then we're going to write predicates about predicates by defining predicates on the numerical representations of the first-order predicates. That's going to let us create a true statement in the logic that can't be proven with the logic.

To do that, we need to figure out how to take our statements and relations represented as numbers, and express properties of those statements and relations in terms of arithmetic. To do that, we need to define just what it means to express something arithmetically. Gödel did that by defining "arithmetically" in terms of a concept called primitive recursion.

I learned about primitive recursion when I studied computational complexity. Nowadays, it's seen as part of theoretical computer science. The idea, as we express it in modern terms, is that there are many different classes of computable functions. Primitive recursion is one of the basic complexity classes. You don't need a Turing machine to compute primitive recursive functions - they're a simpler class.

The easiest way to understand primitive recursion is that it's what you get in a programming language with integer arithmetic, and simple for-loops. The only way you can iterate is by repeating things a bounded number of times. Primitive recursion has a lot of interesting properties: the two key ones for our purposes here are: number theoretic proofs are primitive recursive, and every computation of a primitive recursive function is guaranteed to complete within a bounded amount of time.
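In programming terms - my analogy, not Gödel's - primitive recursion is what you can write if every loop is a for-loop over a range that's computed before the loop starts: no while-loops, no unbounded recursion, guaranteed termination. For example, in Python:

# Primitive-recursive style: only for-loops with precomputed bounds.
def add(x, y):
    for _ in range(x):     # the bound is fixed before the loop starts
        y += 1
    return y

def mul(x, y):
    total = 0
    for _ in range(x):
        total = add(y, total)
    return total

print(mul(6, 7))    # 42, with every iteration bounded in advance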

The formal definition of primitive recursion, the way that Gödel wrote it, is quite a bit more complex than that. But it means the same thing.

We start with what it means to define a formula via primitive recursion. (Note the language that I used there: I'm not explaining what it means for a function to be primitive recursive; I'm explaining what it means to be defined via primitive recursion.) And I'm defining formulae, not functions. In Gödel's proof, we're always focused on numerical reasoning, so we're not going to talk about programs or algorithms, we're going to talk about the definition of formulae.

A formula \(\phi(x_1, x_2, ..., x_n)\) is defined via primitive recursion if, for some other formulae \(\psi\) and \(\mu\):

  • Base: \(\phi(0, x_2, ..., x_n) = \psi(x_2, ..., x_n)\)
  • Recursive: \(\phi(i+1, x_2, ..., x_n) = \mu(i, \phi(i, x_2, ..., x_n), x_2, ..., x_n)\).

So, basically, the first parameter is a bound on the number of times that \(\phi\) can be invoked recursively. When it's 0, you can't invoke \(\phi\) any more.
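A standard concrete instance of the schema (my example, not from Gödel's paper) is addition. Take \(\psi(y) = y\) and \(\mu(i, r, y) = succ(r)\); then the schema gives:

\[add(0, y) = y\]
\[add(i+1, y) = succ(add(i, y))\]

The first parameter counts down to the base case, exactly as described above.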

A formula is primitive recursive if it is defined from a collection of formulae \(\phi_1, ..., \phi_n\) where any formula \(\phi_i\) is defined via primitive recursion from \(\phi_1, ..., \phi_{i-1}\), or the primitive succ function from Peano arithmetic.

For any formula \(\phi_i\) in that sequence, the degree of the formula is the number of other primitive recursive formulae used in its definition.

Now, we can define a primitive recursive property: \(R(x_1, ..., x_n)\) is primitive recursive if and only if there exists a primitive recursive formula \(\phi\) such that \(R(x_1, ..., x_n) \Leftrightarrow \phi(x_1, ..., x_n) = 0\).

With primitive recursive formulae and relations defined, there's a bunch of theorems about how you can compose primitive recursive formulae and relations:

  1. Every function or relation that you get by substituting a primitive recursive function for a variable in a primitive recursive function/relation is primitive recursive.
  2. If R and S are primitive recursive relations, then ¬R, R∧S, and R∨S are all primitive recursive.
  3. If \(\phi(x_1, ..., x_n)\) and \(\psi(x_1, ..., x_n)\) are primitive recursive functions, then the relation \(R(x_1, ..., x_n) \Leftrightarrow (\phi(x_1, ..., x_n) = \psi(x_1, ..., x_n))\) is also primitive recursive.
  4. Let \(xv\) and \(zv\) be finite-length tuples of variables. If the function \(\phi(xv)\) and the relation \(R(y, zv)\) are primitive recursive, then so are the relations:
    • \(S(xv, zv) \Leftrightarrow (\exists y \le \phi(xv). R(y, zv))\)
    • \(T(xv, zv) \Leftrightarrow (\forall y \le \phi(xv). R(y, zv))\)
  5. Let \(xv\) and \(zv\) be finite-length tuples of variables. And let \(\text{argmin}[y \le f(x). R(y)]\) be the smallest value of \(y\) for which \(y \le f(x)\) and \(R(y)\) is true, or 0 if there is no such value. Then if the function \(\phi(xv)\) and the relation \(R(y, zv)\) are primitive recursive, then so is the function \(P(xv, zv) = \text{argmin}[y \le \phi(xv). R(y, zv)]\).

By these definitions, addition, subtraction, multiplication, and integer division are all primitive recursive.
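Theorems 4 and 5 are the license for the bounded-search idiom that the next post leans on so heavily. In programming terms (my sketch, again in Python), bounded quantifiers and bounded argmin are just for-loops with precomputed bounds, so they stay primitive recursive:

# Bounded quantification and bounded argmin as for-loops.
def bounded_exists(bound, pred):
    return any(pred(y) for y in range(bound + 1))

def bounded_forall(bound, pred):
    return all(pred(y) for y in range(bound + 1))

def bounded_argmin(bound, pred):
    for y in range(bound + 1):
        if pred(y):
            return y
    return 0    # the convention: 0 when nothing in range matches

print(bounded_argmin(20, lambda y: y * y > 10))    # 4: smallest y with y**2 > 10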

Ok. So, now we've got all of that out of the way. It's painful, but it's important. What we've done is come up with a good formal description of what it means for something to be an arithmetic property: if we can write it as a primitive recursive relation or formula, it's arithmetic.

So now, finally, we're ready to get to the really good part! Now that we know what it means to define something arithmetically, we can see how to define meta-mathematical properties and formulae arithmetically. Next post, hopefully tomorrow, I'll start showing you arithmetic expressions of meta-math!

