Basics: What is an OS?

(by MarkCC) Dec 05 2013

A reader of this blog apparently likes the way I explain things, and wrote to me to ask a question: what is an operating system? And how does a computer know how to load it?

I'm going to answer that, but in a roundabout way. The usual answer is something like: "An operating system or OS is a software program that enables the computer hardware to communicate and operate with the computer software." In my opinion, that's a cop-out: it doesn't really answer anything. So I'll take the longer path, and hopefully give you an answer that explains things in enough detail that you actually understand what's going on.

When someone like me sets out to write a program, how can we do it? That sounds like an odd question, until you actually think about it. The core of the computer, the CPU, is a device which really can't do very much. It's a self-contained unit which can do lots of interesting mathematical and logical operations, but they all happen completely inside the CPU (how they happen inside the CPU is way beyond this post!). To get stuff in and out of the CPU, the only thing that the computer can do is read and write values from the computer's memory. That's really it.

So how do I get a program in to the computer? The computer can only read the program if it's in the computer's memory. And every way that I can get it into the memory involves the CPU!

Computers are built so that there are certain memory locations and operations that are used to interact with the outside world. They also have signal wires called interrupt pins where other devices (like disk drives) can apply a current to say "Hey, I've got something for you". The exact mechanics are, of course, complicated, and vary from CPU to CPU. But to give you an idea of what it's like, to read some data from disk, you'd do something like the following.

  1. Set aside a chunk of memory where the data should be stored after it's read. This is called a buffer.
  2. Figure out where the data you want to read is stored on the disk. You can identify a disk location with a number. (It's usually a bit more complicated than that, but we're trying to keep this simple.)
  3. Write that number into a special memory location that's monitored by the disk drive controller.
  4. Wait until the disk controller signals you via an interrupt that the data is ready. The data will be stored in a special memory location that the disk controller can write into. (Simplifying again, but this is sometimes called a DMA buffer.)
  5. Copy the data from the controller's DMA buffer into the application's memory buffer that you allocated.
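If you want a feel for what that dance looks like in code, here's a toy Python simulation of those five steps. The FakeDiskController class and its "registers" are pure invention for illustration - a real controller is memory-mapped hardware, and the interrupt is a signal on a wire, not a boolean - but the shape of the dance is the same.

```python
# Toy simulation of the low-level disk-read dance described above.
# FakeDiskController and its "registers" are invented for illustration;
# a real controller uses memory-mapped hardware registers and interrupts.

class FakeDiskController:
    def __init__(self, disk_blocks):
        self.disk_blocks = disk_blocks   # pretend platter: block number -> bytes
        self.command_register = None     # the "special memory location" for requests
        self.dma_buffer = None           # where the controller deposits data
        self.interrupt_pending = False   # stands in for the interrupt wire

    def write_command(self, block_number):
        # Step 3: the CPU writes the block number into the command register.
        self.command_register = block_number
        # The hardware would now seek, read, and raise an interrupt;
        # we fake all of that synchronously.
        self.dma_buffer = self.disk_blocks[block_number]
        self.interrupt_pending = True

def read_block(controller, block_number):
    buffer = bytearray(16)                   # Step 1: set aside a buffer
    # Step 2: we already know which block number we want.
    controller.write_command(block_number)   # Step 3: poke the controller
    while not controller.interrupt_pending:  # Step 4: wait for the "interrupt"
        pass
    controller.interrupt_pending = False
    data = controller.dma_buffer             # Step 5: copy out of the DMA buffer
    buffer[:len(data)] = data
    return bytes(buffer[:len(data)])

disk = FakeDiskController({7: b"hello, disk!"})
print(read_block(disk, 7))   # b'hello, disk!'
```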

When you get down to that level, programming is an intricate dance! No one
wants to do that - it's too complicated, too error prone, and just generally
too painful. But there's a deeper problem: at this level, it's every program
for itself. How do you decide where on the disk to put your data? How can you
make sure that no other program is going to use that part of the disk? How can
you tell another program where to find the data that you stored?

You want to have something that creates the illusion of a much simpler computational world. Of course, under the covers, it's all going to be that incredibly messy stuff, but you want to cover it up. That's the job of an operating system: it's a layer between the hardware and the programs that you run, and it creates a new computational world that's much easier to work in.

Instead of having to do the dance of mucking with the hard disk drive controller yourself, the operating system gives you a way of saying "Open a file named 'foo'", and then it takes that request, figures out where 'foo' is on the disk, talks to the disk drive, gets the data, and then hands you a buffer containing it. You don't need to know what kind of disk drive the data is coming from, or how the name 'foo' maps to sectors of the disk. You don't need to know where the control memory locations for the drive are. You just let the operating system do that for you.
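To see how much that buys you, here's the whole thing from the program's side, in Python (a self-contained sketch, written against a temporary file so it runs anywhere):

```python
# The OS-provided abstraction: just ask for the file by name.
# Everything below the open() call - finding the blocks, talking to the
# controller, copying out of DMA buffers - is the operating system's job.
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "foo")
with open(path, "w") as f:       # the OS maps the name to disk blocks
    f.write("hello from foo")
with open(path) as f:            # and reads them back for us
    contents = f.read()
print(contents)                  # hello from foo
```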

So, ultimately, this is the answer: The operating system is a program that runs on the computer, and creates the environment in which other programs can run. It does a lot of things to create a pleasant environment in which to write and run other programs. Among the multitude of services provided by most modern operating systems are:

  1. Device input and output. This is what we talked about above: direct interaction with input and output devices is complicated and error prone; the operating system implements the input and output processes once, (hopefully) without errors, and then makes it easy for every other program to just use its correct implementation.
  2. Multitasking: your computer has enough power to do many things at once. Most modern computers have more than one CPU. (My current laptop has 4!) And most programs end up spending a lot of their time doing nothing: waiting for you to press a key, or waiting for the disk drive to send it data. The operating system creates sandboxes, called processes, and allows one program to run in each sandbox. It takes care of ensuring that each process gets to run on a CPU for a fair share of the time.
  3. Memory management. With more than one program running at the same time on your computer, you need to make sure that you're using memory that isn't also being used by some other program, and to make sure that no other program can alter the memory that you're using without your permission. The operating system decides what parts of memory can be used by which program.
  4. Filesystems. Your disk drive is really just a huge collection of small sections, each of which can store a fixed number of bits, encoded in some strange format dictated by the mechanics of the drive. The OS provides an abstraction that's a lot easier to deal with.
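To make the process abstraction a bit more concrete, here's a quick illustration in Python (my own example, not from the post): asking the OS for this program's process id, and asking it to spawn a second process with its own id.

```python
# A peek at the process abstraction from Python: the OS gives each
# running program its own process, with its own id and its own memory.
import os
import subprocess
import sys

print(os.getpid())   # this program's process id

# Ask the OS to create a second process, running its own Python program.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.getpid())"],
    capture_output=True, text=True)
child_pid = int(result.stdout)
print(child_pid)     # a different process id - a separate sandbox
```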

I think that's enough for one day. Tomorrow: how the computer knows how to run the OS when it gets switched on!


The Birthday Paradox

(by MarkCC) Nov 18 2013

To me, the thing that makes probability fun is that the results are frequently surprising. We've got very strong instincts about how we expect numbers to work. But when you do anything that involves a lot of computations with big numbers, our intuition goes out the window - nothing works the way we expect it to. A great example of that is something called the birthday paradox.

Suppose you've got a classroom full of people. What's the probability that there are two people with the same birthday? Intuitively, most people expect that it's pretty unlikely. It seems like it shouldn't be likely - 365 possible birthdays, and 20 or 30 people in a classroom? Very little chance, right?

Let's look at it, and figure out how to compute the probability.

Interesting probability problems are all about finding out how to put things together. You're looking at things where there are huge numbers of possible outcomes, and you want to determine the odds of a specific class of outcomes. Finding the solutions is all about figuring out how to structure the problem.

The birthday paradox is a great example of this. It's a problem with a somewhat surprising outcome. It's also a problem where finding the right way to structure the problem has a dramatic effect.

Here's the problem: you've got a group of 30 people. What's the probability that two people out of that group of thirty have the same birthday?

We'll look at it with some simplifying assumptions. We'll ignore leap year - so we've got 365 possible birthdays. We'll assume that all birthdays are equally likely - no variation for weekdays/weekends, no variation for seasons, and no holidays, etc. Just 365 equally probable days.

How big is the space? That is, how many different ways are there to assign birthdays to 30 people? It's \(365^{30}\), or something in the vicinity of \(7.4 \times 10^{76}\).

To start off, we'll reverse the problem. It's easier to structure the problem if we try to ask "What's the probability that no two people share a birthday". If P(B) is the probability that no two people share a birthday, then 1-P(B) is the probability that at least two people share a birthday.

So let's look at a couple of easy cases. Suppose we've got two people. What are the odds that they've got the same birthday? 1 in 365: there are \(365^2\) possible pairs of birthdays, and 365 of those pairs match. So there's a probability of \(365/365^2 = 1/365\) that the two people have the same birthday. For just two people, it's pretty easy. In the reverse form, there's a 364/365 chance that the two people have different birthdays.

What about 3 people? It's the probability of the first two having different birthdays, times the probability of the third person having a different birthday than either of those first two. There are 365 possible birthdays for the third person, and 363 possible days that don't overlap with the first two. So for N people, the probability of having distinct birthdays is \(1 \times (1 - 1/365) \times (1 - 2/365) \times \dots \times (1 - (N-1)/365)\).

At this point, we've got a nice recursive definition. Let's say that \(f(N)\) is the probability of \(N\) people having distinct birthdays. Then:

  1. For 2 people, the probability of distinct birthdays is 364/365. (\(f(2) = \frac{364}{365}\))
  2. For N>2 people, the probability of distinct birthdays is
    \(\frac{365-(N-1)}{365} \times f(N-1)\).

Convert that to a closed form, and you get: \(f(n) = \frac{365!}{(365-n)!\,365^n}\). For 30 people, that's
\(\frac{365!}{335! \times 365^{30}}\). Work it out, and that's
about 0.29 - so the probability of everyone having distinct
birthdays is 29% - which means that the probability of at least
two people in a group of 30 having the same birthday is 71%!
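If you want to check the arithmetic, here's a quick Python sketch (my code, not part of the original derivation) that computes the probability both as the running product and via the closed form:

```python
# Probability that n people all have distinct birthdays, two ways:
# as the running product 1 * (1 - 1/365) * ... * (1 - (n-1)/365),
# and via the closed form 365! / ((365 - n)! * 365^n).
from fractions import Fraction
from math import factorial, prod

def distinct_product(n):
    return prod(1 - k / 365 for k in range(n))

def distinct_closed(n):
    # Exact rational arithmetic avoids overflowing floats with 365!.
    return float(Fraction(factorial(365), factorial(365 - n) * 365**n))

print(round(distinct_product(30), 4))   # 0.2937
print(round(distinct_closed(30), 4))    # 0.2937

# Smallest group size with a >50% chance of a shared birthday:
n = next(n for n in range(2, 100) if 1 - distinct_product(n) > 0.5)
print(n)                                # 23
```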

You can see why our intuitions are so bad? We're talking about something where one factor in the computation is the factorial of 365!

Let's look a bit further: how many people do you need to have before there's a 50% chance of 2 people sharing a birthday? Use the formula we wrote up above, and it turns out to be 23. Here are the numbers - remember that this is the reverse probability, the probability of all birthdays being distinct.

1 1
2 0.997260273973
3 0.991795834115
4 0.983644087533
5 0.9728644263
6 0.959537516351
7 0.943764296904
8 0.925664707648
9 0.905376166111
10 0.883051822289
11 0.858858621678
12 0.832975211162
13 0.805589724768
14 0.776897487995
15 0.747098680236
16 0.716395994747
17 0.684992334703
18 0.653088582128
19 0.620881473968
20 0.588561616419
21 0.556311664835
22 0.524304692337
23 0.492702765676
24 0.461655742085
25 0.431300296031
26 0.401759179864
27 0.373140717737
28 0.345538527658
29 0.319031462522
30 0.293683757281
31 0.269545366271
32 0.24665247215
33 0.225028145824
34 0.20468313538
35 0.185616761125
36 0.16781789362
37 0.151265991784
38 0.135932178918
39 0.121780335633
40 0.108768190182
41 0.0968483885183
42 0.0859695284381
43 0.0760771443439
44 0.0671146314486
45 0.0590241005342
46 0.0517471566327
47 0.0452255971667
48 0.0394020271206
49 0.0342203906773
50 0.029626420422

With just 23 people, there's a greater than 50% chance that two people will have the same birthday. By the time you get to just 50 people, there's a greater than 97% chance that two people have the same birthday!

As an amusing aside, the first time I saw this problem worked through was in an undergraduate discrete probability theory class, with 37 people in the class, and no duplicate birthdays!

Now - remember at the beginning, I said that the trick to working probability problems is all about how you formulate the problem. There's a much, much better way to formulate this.

Think of the assignment of birthdays as a function from people to birthdays: \(f: P \rightarrow B\). The number of ways of assigning birthdays to people is the size of the set of functions from people to birthdays. How many possible functions are there? \(|B|^{|P|}\). \(|B|\) is the number of days in the year - 365, and \(|P|\) is the number of people in the group.

The number of assignments with all-distinct birthdays is the number of injective functions. (An injective function is a function where \(f(x) = f(y) \Leftrightarrow x = y\).) How many injective functions are there? \(\frac{|B|!}{(|B| - |P|)!}\).

The probability of all birthdays being unique is the number of injective functions divided by the number of all assignments: \(\frac{|B|!/(|B| - |P|)!}{|B|^{|P|}} = \frac{365!}{365^{|P|} \times (365 - |P|)!}\).
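A quick sanity check in Python (again my own sketch) shows that counting injections over total functions reproduces the same numbers as the table above:

```python
# P(all distinct) = (# injective functions P -> B) / (# all functions P -> B)
#                 = (|B|! / (|B| - |P|)!) / |B|^|P|
from fractions import Fraction
from math import factorial

def p_all_distinct(days, people):
    injections = Fraction(factorial(days), factorial(days - people))
    total = Fraction(days) ** people
    return float(injections / total)

print(round(p_all_distinct(365, 23), 4))       # 0.4927
print(round(1 - p_all_distinct(365, 23), 4))   # 0.5073 - past the 50% mark
```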

So we've got the exact same result - but it's a whole lot easier in terms of the discrete functions!


The Elegance of Uncertainty

(by MarkCC) Nov 15 2013

I was recently reading yet another botched explanation of Heisenberg's uncertainty principle, and it ticked me off. It wasn't a particularly interesting one, so I'm not going to disassemble it in detail. What it did was the usual crackpot quantum dance: Heisenberg said that quantum means observers affect the universe, therefore our thoughts can control the universe. Blah blah blah.

It's not worth getting into the cranky details. But it inspired me to actually take some time and try to explain what uncertainty really means. Heisenberg's uncertainty principle is fascinating. It's an extremely simple concept, and yet when you realize what it means, it's the most mind-blowingly strange thing that you've ever heard.

One of the beautiful things about it is that you can take the math of uncertainty and reduce it to one simple equation. It says that given any object or particle, the following equation is always true:

\[\sigma_x \sigma_p \ge \frac{\hbar}{2}\]

Where:

  • \(\sigma_x\) is a measurement of the amount of uncertainty
    about the position of the particle;
  • \(\sigma_p\) is the uncertainty about the momentum of the particle; and
  • \(\hbar\) is a fundamental constant, called the reduced Planck constant, which is roughly \(1.05457173 \times 10^{-34}\frac{m^2 kg}{s}\).

That last constant deserves a bit of extra explanation. Planck's constant describes the fundamental granularity of the universe. We perceive the world as being smooth. When we look at the distance between two objects, we can divide it in half, and in half again, and in half again. It seems like we should be able to do that forever. Mathematically we can, but physically we can't! Eventually, we get to a point where there is no way to subdivide distance anymore. We hit the grain-size of the universe. The same goes for time: we can look at what happens in a second, or a millisecond, or a nanosecond. But eventually, it gets down to a point where you can't divide time anymore! Planck's constant essentially defines that smallest unit of time or space.
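A small aside that goes slightly beyond the post: the reduced Planck constant by itself doesn't define a length or a time. The "grain sizes" usually quoted come from combining it with the gravitational constant and the speed of light. A quick computation (my own sketch):

```python
# Planck length and time from hbar, G, and c.
# (hbar alone doesn't set a length scale - it takes G and c as well.)
from math import sqrt

hbar = 1.054571e-34   # J*s, reduced Planck constant
G = 6.674e-11         # m^3 kg^-1 s^-2, gravitational constant
c = 2.998e8           # m/s, speed of light

planck_length = sqrt(hbar * G / c**3)   # ~1.6e-35 m
planck_time = sqrt(hbar * G / c**5)     # ~5.4e-44 s
print(f"{planck_length:.3e} m, {planck_time:.3e} s")
```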

Back to that beautiful equation: what uncertainty says is that the product of the uncertainty about the position of a particle and the uncertainty about the momentum of a particle must be at least a certain minimum.

Here's where people go wrong. They take that to mean that our ability to measure the position and momentum of a particle is uncertain - that the problem is in the process of measurement. But no: it's talking about a fundamental uncertainty. This is what makes it an incredibly crazy idea. It's not just talking about our inability to measure something: it's talking about the fundamental true uncertainty of the particle in the universe because of the quantum structure of the universe.

Let's talk about an example. Look out the window. See the sunlight? It's produced by fusion in the sun. But fusion should be impossible. Without uncertainty, the sun could not exist. We could not exist.

Why should it be impossible for fusion to happen in the sun? Because it's nowhere near dense or hot enough.

There are two forces that you need to consider in the process of nuclear fusion. There's the electromagnetic force, and there's the strong nuclear force.

The electromagnetic force, we're all familiar with. Like charges repel, different charges attract. The nucleus of an atom has a positive charge - so nuclei repel each other.

The nuclear force we're less familiar with. The protons in a nucleus repel each other - they've still got like charges! But there's another force - the strong nuclear force - that holds the nucleus together. The strong nuclear force is incredibly strong at extremely short distances, but it diminishes much, much faster than electromagnetism. So if you can get a proton close enough to the nucleus of an atom for the strong force to outweigh the electromagnetic, then that proton will stick to the nucleus, and you've got fusion!

The problem with fusion is that it takes a lot of energy to get two hydrogen nuclei close enough to each other for that strong force to kick in. In fact, it turns out that hydrogen nuclei in the sun are nowhere close to energetic enough to overcome the electromagnetic repulsion - not by multiple orders of magnitude!

But this is where uncertainty comes into play. The core of the sun is a dense soup of hydrogen atoms. They can't move around very much without hitting the other atoms around them. That means that their momentum is very constrained - \(\sigma_p\) is very small, because there's just not much possible variation in how fast they're moving. But the product of \(\sigma_p\) and \(\sigma_x\) has to be greater than \(\hbar/2\), which means that \(\sigma_x\) needs to be pretty large to compensate for the certainty about the momentum.

If \(\sigma_x\) is large, that means that the particle's position is not very constrained at all. It's not just that we can't tell exactly where it is: its position is fundamentally fuzzy. It doesn't have a precise position!

That uncertainty about the position allows a strange thing to happen. The fuzziness of position of a hydrogen nucleus is large enough that it overlaps with the nucleus of another atom - and bang, they fuse.

This is an insane idea. A hydrogen nucleus doesn't get pushed into a collision with another hydrogen nucleus. It randomly appears in a collided state, because its position wasn't really fixed. The two nuclei that fused didn't move: they simply didn't have a precise position!

So where does this uncertainty come from? It's part of the hard-to-comprehend world of quantum physics. Particles aren't really particles. They're waves. But they're not really waves. They're particles. They're both, and they're neither. They're something in between, or they're both at the same time. But they're not the precise things that we think of. They're inherently fuzzy probabilistic things. That's the source of the uncertainty: at macroscopic scales, they behave as if they're particles. But they aren't really. So the properties that we associate with particles just don't work. An electron doesn't have an exact position and velocity. It has a haze of probability space where it could be. The uncertainty equation describes that haze - the inherent uncertainty that's caused by the real particle/wave duality of the things we call particles.


This one's for you, Larry! The Quadrature BLINK Kickstarter

(by MarkCC) Nov 14 2013

After yesterday's post about the return of vortex math, one of my coworkers tweeted the following at me:

Larry's a nice guy, even if he did give me grief at my new-hire orientation. So I decided to take a look. And oh my, what a treasure he found! It's a self-proclaimed genius with a wonderful theory of everything. And he's running a kickstarter campaign to raise money to publish it. So it's a lovely example of profound crackpottery, with a new variant of the buy my book gambit!

To be honest, I'm a bit uncertain about this. At times, it seems like the guy is dead serious; at other times, it seems like it's an elaborate prank. I'm going to pretend that it's completely serious, because that will make this post more fun.

So, what exactly is this theory of everything? I don't know for sure. He's dropping hints, but he's not going to tell us the details of the theory until enough people buy his book! But he's happy to give us some hints, starting with an explanation of what's wrong with physics, and why a guy with absolutely no background in physics or math is the right person to revolutionize physics! He'll explain it to us in nine brief points!

First: Let me ask you a question. Since the inclusion of Relativity and Dirac’s Statistical Model, why has Physics been at loose ends to unify the field? Everyone has tried and failed, and for this reason so many have pointed out: what we don’t need, is another TOE, Theory of Everything. So if I was a Physicist, my theory would probably just be one of these… a failed TOE based on the previous literature.

But why do these theories fail? One thing for sure is that in academia every new ideas stems from previously accepted ideas, with a little tweak here or there. In the main, TOEs in Physics have this in common, and they all have failed. What does this tell you?

See, those physicists, they're all just trying the same stuff, and they all failed, therefore they'll never succeed.

When I look at modern physics, I see some truly amazing things. To pull out one particularly prominent example from this year, we've got the Higgs boson. He'll sneer at the Higgs boson a bit later, but that was truly astonishing: decades ago, based on a deep understanding of the standard model of particle physics, a group of physicists worked out a theory of what mass was and how it worked. They used that to make a concrete prediction about how their theory could be tested. It was untestable at the time, because the kind of equipment needed to perform the experiment didn't exist, and couldn't be built with the technology of the day. 50 years later, after technology advanced, their prediction was confirmed.

That's pretty god-damned amazing if you ask me.

Based on the arguments from our little friend, a decade ago, you could have waved your hands around, and said that physicists had tried to create theories about why things had mass, and they'd failed. Therefore, obviously, no theory of mass was going to come from physics, and if you wanted to understand the universe, you'd have to turn to non-physicists.

On to point two!

Second: the underlying assumptions in Physics must be wrong, or somehow grossly mis-specified.

That's it. That's the entire point. No attempt to actually support that argument. How do we know that the underlying assumptions in physics must be wrong? Because he says so. Period.

Third: Who can challenge the old paradigm of Physics, only Copernicus? Physicists these days cannot because they are too inured of their own system of beliefs and methodologies. Once a PhD is set in place, Lateral Thinking, or “thinking outside the box,” becomes almost impossible due to departmental “silo thinking.” Not that physicists aren’t smart – some are genius, but like everyone in the academic world they are focused on publishing, getting research grants, teaching and other administrative duties. This leaves little time for creative thinking, most of that went into the PhD. And a PhD will not be accepted unless a candidate is ready and willing to fall down the “departmental silo.” This has a name: Catch 22.

It's the "good old boys" argument. See, all those physicists are just doing what their advisors tell them to; once they've got their PhD, they're just producing more PhDs, enforcing the same bogus rules that their advisors inflicted on them. Not a single physicist in the entire world is willing to buck this! Not one single physicist in the world is willing to take the chance of going down as one of the greatest scientific minds in history by bucking the conventional wisdom.

Except, of course, there are plenty of people doing that. For an example, right off the top of my head, we've got the string theorists. Sure, they get lots of justifiable criticism. But they've worked out a theory that does seem to describe many things about the universe. It's not testable with present technology, and it's not clear that it will ever be testable with any kind of technology. But according to Bretholt's argument, the string theorists shouldn't exist. They're bucking the conventional model, and they're getting absolutely hammered for it by many of their colleagues - but they're still going ahead and working on it, because they believe that they're on to something important.

Fourth: There is not much new theory-making going on in Physics since its practitioners believe their Standard Model is almost complete: just a few more billion dollars in research and all the colors of the Higgs God Particle may be sorted, and possibly we may even glimpse the Higgs Field itself. But this is sort of like hunting down terrorists: if you are in control of defining what a terrorist is, then you will never be out of a job or be without a budget. This has a name too: Self-Fulfilling Prophesy. The brutal truth…

Right, there's not much new theory-making going on in physics. No one is working on string theory. There's no one coming up with theories about dark matter or dark energy. There's no one trying to develop a theory of quantum gravity. No one ever does any of this stuff, because there's no new theory-making going on.

Of course, he hand-waves away one of the most fantastic theory-confirmations in the history of physics. The Higgs got lots of press, and lots of people like to hand-wave about it and overstate what it means. ("It's the god particle!") But even stripped down to its bare minimum, it's an incredible discovery, and for a jackass like this to wave his hands and pretend that it's meaningless and we need to stop wasting time on stuff like the LHC and listen to him: I just don't even know the right words to describe the kind of disgust it inspires in me.

Fifth: Who then can mount such a paradigm-breaking project? Someone like me, prey tell! But birds like me just don’t sit around the cage and get fat, we fly to the highest vantage point, and see things for what they are! We have a name as well: Free Thinkers. We are exactly what your mother warned you of… There’s a long list of us include Socrates, Christ, Buddha, Taoist Masters, Tibetan Masters, Mohammed, Copernicus, Newton, Maxwell, Gödel, Hesse, Jung, Tesla, Planck… All are Free Thinkers, confident enough in their own knowledge and wisdom that they are willing to risk upsetting the applecart! We soar so humanity can peer beyond its petty day to day and discover itself.

There are two things that really annoy me about this paragraph. First of all, there's the arrogance. This schmuck hasn't done anything yet, but he sees fit to announce that he's up there with Newton, Maxwell, etc.

Second, there's the mushing together of scientists and religious figures. Look, I'm a religious Jew. I don't have anything against theology, theologians, or religious authorities. But science is different. Religion is about subjective experience. Even if you believe profoundly in, say, Buddhism, you can't just go through the motions of what Buddha supposedly did and get exactly the same result. There's no objective, repeatable way of testing it. Science is all about the hard work of repeatable, objective experimentation.

He continues point 5:

This chain might have included Einstein and Dirac had they not made three fatal mistakes in Free Thinking: They let their mathematical machine dictate what was true rather than using mathematics only to confirm their observations, they got fooled by their own anthropomorphic assumptions, and then they rooted these assumptions into their mathematical methods. This derailed the last two generations of scientific thinking.

Here's where he strays into the real territory of this blog.

Crackpots love to rag on mathematics. They can't understand it, and they want to believe that they're the real geniuses, so the math must be there to confuse things!

Scientists don't use math to be obscure. Learning math to do science isn't some sort of hazing ritual. The use of math isn't about making science impenetrable to people who aren't part of the club. Math is there because it's essential. Math gives precision to science.

Back to the Higgs boson for a second. The people who proposed the Higgs didn't just say "There's a field that gives things mass". They described what the field was, how they thought it worked, how it interacted with the rest of physics. The only way to do that is with math. Natural language is both too imprecise, and too verbose to be useful for the critical details of scientific theories.

Let me give one example from my own field. When I was in grad school, there was a new system of computer network communication protocols under design, called OSI. OSI was complex, but it had a beauty to its complexity. It carefully divided the way that computer networks and the applications that run on them work into seven layers. Each layer only needed to depend on the details of the layer beneath it. When you contrast it against TCP/IP, it was remarkable. TCP/IP, the protocol that we still use today, is remarkably ad-hoc, and downright sloppy at times.

But we're still using TCP/IP today. Why?

Because OSI was specified in English. After years of specification, several companies and universities implemented OSI network stacks. When they connected them together, what happened? It didn't work. No two of the reference implementations could talk to each other. Each of them was perfectly conformant with the specification. But the specification was imprecise. To a human reader, it seemed precise. Hell, I read some of those specifications (I worked on a specification system, and read all of the specs for layers 3 and 4), and I was absolutely convinced that they were precise. But English isn't a good language for precision. It turned out that what we all believed was a perfectly precise specification actually had numerous gaps.

There's still a lot of debate about why the OSI effort failed so badly. My take, having been in the thick of it, is that this was the root cause: after all the work of building the reference implementations, they realized that their specifications needed to go back to the drawing board, and get the ambiguities fixed - and the world outside of the OSI community wasn't willing to wait. TCP/IP, for all of its flaws, had a perfectly precise specification: the one, single, official reference implementation. It might have been ugly code, it might have been painful to try to figure out what it meant - but it was absolutely precise: whatever that code did was right.

That's the point of math in science: it gives you that kind of unambiguous precision. Without precision, there's no point to science.

Sixth: What happens to Relativity when the assumptions of Lorentz’ space-time is removed? Under these assumptions, the speed of light limits the speed of moving bodies. The Lorentz Transformation was designed specifically to set this speed limit, but there is no factual evidence to back it up. At first, the transformation assumed that there would be length and time dilations and a weight increase when travelling at sub-light speeds. But after the First Misguided Generation ended in the mid 70’s, the weight change idea was discarded as untenable. It was quietly removed because it implied that a body propagating at or near the speed of light would become infinitely massive and turn into a black hole. Thus, the body would swallow itself up and disappear!

Whoops… bad assumption!

The space contraction idea was left intact because it was imperative to Hilbert’s rendition of the space-time geodesic that he devised for Einstein in 1915. Hilbert was the best mathematician of his day, if not ever! He concocted the mathematical behemoth called General Relativity to encapsulate Einstein's famous insight that gravitation was equivalent to an accelerating frame. Now, not only was length assumed to contract, but space was assumed to warp and gravitation was assumed to be an accelerating frame, though no factual evidence exists to back up these assumptions!

Whoops… 3 bad assumptions in a row!

This is an interestingly bizarre argument.

Relativity predicts a change in mass (not weight!) as velocity increases. That prediction has not changed. It has been confirmed, repeatedly, by numerous experiments. The entire reasoning here is based on the unsupported assertion that relativistic changes in mass have been discarded as incorrect. But that couldn't be farther from the truth!

Similarly, he's asserting that the space-warping effects of gravity - one of the fundamental parts of general relativity - are incorrect, again without the slightest support.

This is going to seem like a side-track, but bear with me:

When I came in to my office this morning, I took out my phone and used foursquare to check in. How did that work? Well, my phone received signals from a collection of satellites, and based on the tiny differences in the data contained in those signals, it was able to pinpoint my location precisely: the corner of 43rd Street and Madison Avenue, outside of Grand Central Terminal in Manhattan.

To be able to pinpoint my location that precisely, the system ultimately relies on clocks in the satellites. Those clocks are in orbit, moving very rapidly, and at a different position in the earth's gravity well. Space-time is less warped at their elevation than it is here on earth. Relativity predicts that, because of that, the clocks in those satellites must run at a different rate than clocks here on earth. In order to get precise positions, those clocks need to be adjusted to keep time with the receivers on the surface of the earth.

If relativity - with its interconnected predictions of changes in mass, time, and the warp of space-time - didn't work, then the corrections made by the GPS satellites wouldn't be needed. And yet, they are.
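
Just how big are those corrections? Here's a back-of-the-envelope sketch in Python. The orbital radius, the circular-orbit speed, and the simple potential-difference formula are my approximations - a real treatment is more involved - but it's enough to get the size of the effect:

```python
import math

# Physical constants and approximate GPS orbital parameters.
C = 299_792_458.0        # speed of light, m/s
GM = 3.986004418e14      # Earth's gravitational parameter, m^3/s^2
R_EARTH = 6.371e6        # mean radius of the Earth, m
R_ORBIT = 2.6560e7       # GPS orbital radius (~20,200 km altitude), m

# Orbital speed for a circular orbit: v = sqrt(GM / r).
v = math.sqrt(GM / R_ORBIT)

# Special relativity: the moving clock runs slow by v^2 / (2 c^2).
sr_rate = -v**2 / (2 * C**2)

# General relativity: the higher clock runs fast by the potential
# difference GM/c^2 * (1/R_earth - 1/r_orbit).
gr_rate = (GM / C**2) * (1 / R_EARTH - 1 / R_ORBIT)

seconds_per_day = 86_400
net_us_per_day = (sr_rate + gr_rate) * seconds_per_day * 1e6
print(f"net clock offset: about {net_us_per_day:.1f} microseconds/day")
```

That lands right around the commonly quoted figures: the satellites' speed slows their clocks by about 7 microseconds per day, while their higher position in the gravity well speeds them up by about 46, for a net drift of roughly 38 microseconds per day - which is the size of the correction that GPS builds in.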

There are numerous other examples of this. We've observed relativistic effects in many different ways, in many different experiments. Despite what Mr. Bretholt asserts, none of this has been disproven or discarded.

Seventh: Many, many, many scientists disagree with Relativity for these reasons and others, but Physics keeps it as a mainstream idea. It has been violated over and over again in various space programs, and is rarely used in the aerospace industry when serious results are expected. Physics would like to correct Relativity because it doesn’t jive with the Quantum Standard Model, but they can’t conceive how to fix it.

In Quadrature Theory the problem with Relativity is obvious and easily solved. The problem is that the origin and nature of space is not known, nor is the origin and nature of time or gravitation. Einstein did not prove anything about gravitation, nor has anyone since. The “accelerating frame” conjecture is for the convenience of mathematics and sheds no light on the nature of gravitation itself. Quantum Chromo Dynamics, QCD, hypothesizes the “graviton” on the basis of similarly convenient mathematics. Many scientists disagree with such “force carrier” propositions: they are all but silenced by the trends in Physics publishing, however. The “graviton” is, nevertheless, a mathematical fiction similar to Higgs Boson.

Whoops… a couple more bad assumptions, but where did they come from?

Are there any serious scientists who disagree with relativity? Mr. Bretholt doesn't actually name any. I can't think of any credible ones. Certainly pretty much all physicists agree that there's a problem, because relativity and quantum physics both appear to be correct, but they're not really compatible. It's a major area of research. But that's a different thing from saying that scientists "disagree" with or reject relativity. Relativity has passed every experimental test that anyone has been able to devise.

Of course, it's completely true that Einstein didn't prove anything about gravity. Science doesn't deal with proof. Science devises models based on observations. It tries to find the best predictive model of the universe that it can, based on repeated observation. Science can disprove things, by showing that they don't match our observations of reality, but it can't prove that a theory is correct. So we can never be sure that our model is correct - just that it does a good job of making predictions that match further observations. Relativity could be completely, entirely, 100% wrong. But given everything we know now, it's the best predictive theory we have, and nothing we've been able to do can disprove it.

Ok, I've gone on long enough. If you want to see his last couple of points, go ahead and follow the link to his "article". After all of this, we still haven't gotten to anything about what his supposed new theory actually says, and I want to get to just a little bit of that. He's not telling us much - he wants money to print his book! - but what little he says is on his kickstarter page.

So let me introduce that modification: it’s called Quadrature, or Q. Quadrature arose from Awareness as the original separation of Awareness from itself. This may sound strangely familiar; I elaborate at length about it in BLINK. The Theory of Quadrature develops Q as the Central Generating Principle that creates the Universe step by step. After a total of 12 applications of Quadrature, it folds back on itself like a snake biting its tail. Due to this inevitable closure, the Universe is complete, replete with life, energy and matter, both dark and light. As a necessary consequence of this single Generating Principle, everything in the Universe is ultimately connected through ascending levels of Awareness.

The majesty and mystery of Awareness and its manifestation remains, but this vision puts us inside as co-creative participants. I think you will agree that this is highly desirable from a metaphysical point of view. Quadrature is the mechanism that science has been looking for to unify these two points of view. Q has been foreshadowed in many ways in both physics and metaphysics. As developed in BLINK, Quadrature Theory can serve as a Theory of Everything.

Pretty typical grandiose crackpottery. This looks an awful lot like a variation of Langan's CTMU. It's all about awareness! And there's a simple "mathematical" construct called "quadrature" that makes it all work. Of course, I can't tell you what quadrature is. No, you need to pay me! Give me money! And then I'll deign to explain it to you.

To make a long story short, Quadrature Theory supports four essential claims that undermine Relativity, Quantum Mechanics, and Cosmology while placing these disciplines back on a more secure foundation once their erroneous assumptions have been removed. These are:

  1. The origin of space and its nature arise from Quadrature. Space is shown to be strictly rectilinear; space cannot warp under any conditions.
  2. The origin of the Tempic Field and its nature arise from Quadrature. This field facilitates all types of energetic interaction and varies throughout space. The idea of time arises solely from transactions underwritten by the Tempic Field. Therefore, time as we know it here on Earth is a local anomaly, which uniquely affects all interactions including the speed of light. “C,” in fact, is a velocity, and is variable in both speed and direction depending on the gradient of the Tempic Field. Thus, “C” varies drastically off-planet!
  3. Spin is a fundamental operation in space that constitutes the only absolute measurement. Its density throughout space is non-linear and it generates a variable Tempic Field within spinning systems such as atoms, or galaxies. This built-in “time” serves to hold the atom together eternally, and has many other consequences for Quantum Mechanics and Cosmology.
  4. Gravity is also a ringer in physics. Nothing of the fundamental origin of gravity is known, though we know how to use it quite well. Given the consequence of Spin, gravity can be traced to forms that have closed Tempic Fields. The skew electric component of spinning systems will align to create an aggregated, polarized, directional field: gravity.

Pop science, of course, loves to talk about black holes, worm holes, time warps and all manner of the ridiculous in physics. There is much more fascinating stuff than this in my book, and it is completely consistent with what is observable in the Universe. For example, I propose the actual purpose of the black hole and why every galaxy has one. At any rate, perhaps you now have an inkling of why Quadrature Theory is a Revolution Waiting to Happen!

Pure babble, stringing together words in nonsensical ways. As my mantra goes: the worst math is no math. Here he's arguing that rigorous, well-tested mathematical models are incorrect - because vague reasons.


Vortex Math Returns!

(by MarkCC) Nov 12 2013

Cranks never give up. That's something that I've learned in my time writing this blog. It doesn't matter how stupid an idea is. It doesn't matter how obviously wrong, how profoundly ridiculous. No matter what, cranks will continue to push their ridiculous ideas.

One way that this manifests is that the comments on old posts never quite die. Years after I initially write a post, I still have people coming back and trying to share "new evidence" for their crankery. George Shollenberger, the hydrino cranks, the Brown's gas cranks, the CTMU cranks - they've all come back years after a post with more of the same-old, same-old. Most of the time, I just ignore it. There's nothing to be gained in rehashing the same old nonsense. It's certainly not going to convince the cranks, and it's not going to be interesting to my less insane readers. But every once in a while, something that's actually new and amusing comes along in those comments. Today I've got an example of that for you: one of the proponents of Marko Rodin's "Vortex Math" has returned to tell us the great news!

I have linked Vortex Based Mathematics with Physics and can prove most physics using vortex based mathematics. I am writing an article call "Temporal Physics of Vortex Based Mathematics" here: http://www.vortexspace.org

This is a lovely thing, even without needing to actually look at his article. Just start at the very first line! He claims that he can "prove most of physics".

Science doesn't do proof.

What science does is make observations, and then based on those observations produce models of the universe. Then, using that model, it makes predictions, and compares those predictions with further observations. By doing that over and over again, we get better and better models of how the universe works. Science is never sure about anything - because all it can do is check how well the model works. It's always possible that any model doesn't describe how things actually work. But it gives us a good approximation, in a way that allows us to understand how things work. Or, not quite how things work, but how we can affect the world by our actions. Our model might not capture what's really happening - but it's got predictive power.

To give an example of this: our model of the universe says that the earth orbits the sun, which orbits the galactic core, which is moving through the universe. It's possible that this is wrong. You can propose an alternative model in which the earth is the stationary center of the universe, and everything moves around it. As a model, it's not very attractive, because to make it fit our observations, it requires a huge amount of complexity - it's a far, far more complex model than our standard one, and it's much harder to use to make accurate predictions. But it can be made to work, just as well as our standard one. It's possible that that's how the universe actually works. I don't think any reasonable person actually believes that the universe works that way, but it's possible that our entire model is wrong. Science can't prove that our model is correct. It can just show that it's the simplest model that matches our observations.

But Mr. Calhoun claims that he can prove physics. That claim shows that he has no idea of what science is, or what science means. And if he doesn't understand something that simple, why should we trust him to understand any more?

Ah, but when we take a look at some of his writings... it's a lovely pile of rubbish. Remember the mantra of this blog? The worst math is no math. Mr. Calhoun's writing is a splendid example of this. He claims to be doing science, math, and mathematical proofs - but when you actually look at his writing, there's not a speck of genuine math to be found!

Let's start with a really quick reminder of what vortex math is. Take the sequence of doubling in natural numbers in base-10. 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, .... If, for each of those numbers, you sum the digits until you get a single digit result, you get: 1, 2, 4, 8, 7, 5, 1, 2, 4, 8, 7, 5, ... It turns into a repeated sequence, 1, 2, 4, 8, 7, 5, over and over again. You can do the same thing in the reverse direction, by halving: 1, 0.5, 0.25, 0.125, 0.0625, 0.03125, 0.015625, 0.0078125, where the digits sum to 1, 5, 7, 8, 4, 2, 1, 5, ...
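
There's no mystery to that cycle, by the way. Repeatedly summing decimal digits computes the digital root, which is just the number mod 9 (with 9 standing in for 0). A quick sketch:

```python
def digital_root(n):
    # Sum the decimal digits repeatedly until a single digit remains.
    # For positive n, this is equivalent to 1 + (n - 1) % 9.
    while n >= 10:
        n = sum(int(d) for d in str(n))
    return n

# The "profound" cycle is just the powers of 2 mod 9.
print([digital_root(2**i) for i in range(12)])  # [1, 2, 4, 8, 7, 5, 1, 2, 4, 8, 7, 5]
```

The cycle repeats with period 6 because 2^6 = 64 ≡ 1 (mod 9); it's a property of base-10 notation, not of the universe.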

According to Rodin, this demonstrates something profound. This is the heart of Vortex mathematics: this cycle in the numbers shows that there's some kind of energy flow that is fundamental to the universe, based on this kind of repeating sequence.

So, how does Mr. Calhoun use this? He thinks that he can connect it to black holes and white holes:

Do not forget that we already learned that black holes suck in matter while "compressing" it; and, on the other side of the black hole is a white hole that then takes the same matter and spits it back out while "de-compressing" the matter. The "magnetic warp" video on Youtube shows the same torus shape Marko had illustrated in his "vortex based mathematics" video [see below]:

You can clearly see the vortex in the center of the torus magnets. This is made possible using two Ferrofluid Hele-Shaw Cells [Hele-Shaw effect]. Here are a few links about using ferrofluid hele-shaw cell to view magnetic fields:

http://en.wikipedia.org/wiki/Hele-Shaw_flow

http://www2.warwick.ac.uk/fac/cross_fac/iatl/ejournal/issues/volume2issue1/snyder/

Here is a quote from a Youtube user about the magnets:

"Walter Rawls, a? scientist who did a great deal of research with Albert Roy Davis, said that he believes at the center of every magnet there is a miniature black hole."

I have not verified the above statement about Walter Rawls as of yet. However, the above images prove beyond doubt Marko's torus universe mathematical geometry. Now lets take a look at Marko's designs:

The pictures look kind-of-like this silly torus thing that Rodin likes to draw: therefore they prove beyond doubt that Rodin's rubbish is correct! Wow, now that's a mathematical proof!

It gets worse from there.

The next section is "The Physics of Time".

If you looked at the Youtube videos of the true motion of the Earth through space you now know that we are literally falling into a black hole that is at the center of the galaxy. The motion of the Earth; all of the rotation and revolution, all of that together is caused by space-time. Time is acually the rate and pattern of the motion of matter as it moves through space. It is the fourth dimension. you have probably heard this if you have studied Einstien theories: "As an object moves faster the rate of its motion [or time] slows down". Sounds like an oxymoron doesn't it? Well it not so strange once you understand how the fabric of space-time relates to Vortex Based Mathematics.

Motion of the Earth

The planet Earth rotates approx every twenty-four hours. It makes a complete 360° rotation every twenty-four hours. That amount of time is the frequency of the rate of rotation.

Looking down from the north pole of the Earth, you will see that if we divide the sphere into 36 equal parts the sunrise would have to pass through all of the degrees of the sphere in order to make a complete cycle:

Remember the Earth is a "giant magnet" that is spinning. The electromagnetic field of this "giant magnet" is moving out of the north pole [which is really at the geographic south pole] and going to the south pole [which again is really at the geographic north pole]. This electromagnetic field is moving or spinning [see youtube video at top] according to a frequency or cycle.

I don't know if you realize this, but matter can be compressed or expanded without it being destroyed. A black hole does not de-molecularize matter then in passing to the white hole reassemble it again. Nothing that is demolecularized can naturally be put back together again. If an object is destroyed then is it destroyed; there is no reassembly. Matter can be however, compressed and decompressed. As you probably know and have heard this before there is an huge amount of distance between the atoms in your body. Like the giant void of space and much like the distances between planets in our solar system; the atomic matter in our bodies is just as similar in the amount of space between each atom.

What fills the spaces between each atom? Well, Its space-time. It is the fabric of the inertia ether that all matter in space moves through. Spacetime or what I call "etherspace" is what I have come to realize as "the space in between the spaces". This "etherspace" can be compressed and then decompressed. Etherspace can enable all of the matter in your body to be greatly compressed without your body being destroyed; and at the same time functioning as it normally should. The ether space then allows your body to be decompressed again; all the while functioning as it should.

It is the movement of spacetime or "ether space" that is causing the rotation and revolving of the planet we live on. It is also responsible for the motions of all of the bodies in space.

Magnets will, whether great or small, act as engines for etherspace. They pull in etherspace at the south pole and also pump out etherspace at the north pole of the magnet. All magnets do this; the great planet earth all the way to the little magnet that sticks to your refridgerator door. Vortex based mathematics prove all of this. I will show you.

As I stated earlier the Earth is a giant magnet and if we apply the Vortex Based Mathematics to the 10° spacings of this "giant magnet" lets see what happens. Now we are going to see the de-compression of space-time eminatiing from the true north pole of the giant magnet of the Earth. Let's deploy a doubling circuit to the spacings of the planet. We will start at 0° and go all the way to 360°.

Calhoun certainly shows that he's a worthy inheritor of the mantle of Rodin. Rodin's entire rubbish is really based on taking a fun property of our particular base-10 numerical notation, and without any good reason, believing that it must be a profound fundamental property of the universe. Calhoun takes two arbitrary things: the 360 degree conventional angle measurement, and the 24 hour day, and likewise, without any good reason, without even any argument, believes that they are fundamental properties of the universe.

Where does the 24 hour day come from? I did a bit of research, and there are a couple of possible arguments. It appears to date back to the old empire of Egypt. The argument that I found most convincing is based on how the Egyptians counted on their hands. They did a lot of things in base-12, because using your thumb to point out the joints of the fingers on your hand, you can count to 12. The origin of our base-10 is based on using fingers to count; base-12 is similar, but based on a slightly different way of counting on your fingers. Using base-12, they decided to describe time in terms of counting periods of light and darkness: 12 bright periods, 12 dark ones. There's nothing scientific or fundamental about it: it's an arbitrary way of measuring time. The Greeks adopted it from the Egyptians; the Romans adopted it from the Greeks; and we adopted it from the Romans. There is no fundamental reason why it is the one true correct way of measuring time.

Similarly, the 360 degree system of angular measure is not the least bit fundamental. It dates back to the Babylonians. In writing, the Babylonians used a base-60 system, instead of our base-10. In their explorations of geometry, they observed that if you inscribed a hexagon inside of a circle, each of the segments of the hexagon was the same length as the radius of the circle. So they measured an angle in terms of which segment of the inscribed hexagon it crossed. Within those six segments, they divided each into sixty sections, because what else would people who use base-60 use? And then to subdivide those, they used 60 again. The 360 degree system is a random historical accident, not a profound truth.

I don't want to get too far off track (or any farther off track), but: when you're talking about angles, there is a fundamental measurement, called a radian. Whenever you do math using angles measured in degrees, you end up needing to introduce a conversion factor which converts your angle into radians.
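
For example, the trig functions in essentially every math library expect radians, so a degree measurement has to be converted before you can use it:

```python
import math

angle_deg = 30.0
angle_rad = math.radians(angle_deg)  # same as angle_deg * (math.pi / 180)

print(math.sin(angle_rad))  # sin of 30 degrees: 0.5, up to float rounding
```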

Anyway - this rubbish about the 24 hour day and 360 degree circle are what passes for math in Calhoun's world. This is as close to math or to correctness that Calhoun gets.

What's even worse is his babble about black holes and white holes.

Both black and white holes are theoretical predictions of relativity. The math involved is not simple: it's based on Einstein's field equations from general relativity:

\[ R_{\mu\nu} - \frac{1}{2}g_{\mu\nu}R + g_{\mu\nu}\Lambda = \frac{8\pi G}{c^4}T_{\mu\nu} \]

In this equation, the subscripted variables are all symmetric 4x4 tensors. Black and white holes are "solutions" to particular configurations of those tensors. This is not elementary math, not by a long-shot. But if you want to really talk about black and white holes, this is how you do it.

Translating from the math into prose is always a problem, because the prose is far less precise, and it's inevitably misleading. No matter how well you think you understand based on the prose, you don't understand the concept, because you haven't been told enough, in a precise enough way, to actually understand it.

That said, the closest I can come is the following.

We'll start with black holes. Black holes are much easier to understand: put enough mass into a small enough area of space, and you wind up with a boundary line, called the event horizon, where anything that crosses that boundary, no matter what - even massless stuff like light - can never escape. We believe, based on careful analysis, that we've observed black holes in our universe. (Or rather, we've seen evidence that they exist; you can't actually see a black hole; but you can see its effects.) We call a black hole a singularity, because nothing beyond the event horizon is visible - it looks like a hole in space. But it isn't: it's got a mass, which we can measure. Matter goes in to a black hole, and crosses the event horizon. We can no longer see the matter. We can't observe what happens to it once it crosses the horizon. But we know it's still there, because we can observe the mass of the hole, and it increases as matter enters.
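
"Enough mass into a small enough area" has a precise form: a non-rotating mass M becomes a black hole if it's squeezed inside its Schwarzschild radius, r_s = 2GM/c^2. Here's a quick sketch of how extreme that is:

```python
G = 6.674e-11      # gravitational constant, m^3 / (kg s^2)
C = 299_792_458.0  # speed of light, m/s

def schwarzschild_radius(mass_kg):
    # Radius of the event horizon for a non-rotating mass.
    return 2 * G * mass_kg / C**2

M_SUN = 1.989e30   # mass of the sun, kg
print(schwarzschild_radius(M_SUN))  # about 2950 meters: the sun squeezed into ~3 km
```

For the earth, the same formula gives about 9 millimeters - a good illustration of why black holes only form under spectacularly extreme conditions.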

(It was pointed out to me on twitter that my explanation of the singularity is wrong. See what happens when you try to explain mathematical stuff non-mathematically?)

White holes are a much harder idea. We've never seen one. In fact, we don't really think that they can exist in our universe. In concept, they're the opposite of a black hole: they are a region with a boundary that nothing can ever cross into. In a black hole, you can't cross the boundary and escape; in a white hole, once something crosses the boundary, it can't ever re-enter. White holes only exist in a strange conceptual case, called an eternal black hole - that is, a black hole that has been there forever, which was never formed by gravitational collapse.

There are some folks who've written speculative work based on the solutions to the white hole field equations that suggest that our universe is the result of a white hole, inside of the event horizon of a black hole in an enclosing universe. But in this solution, the white hole exists for an infinitely small period of time: all of the matter in it ejects into a new space-time realm in an instant. There's no actual evidence for this, beyond the fact that it's an interesting way of interpreting a solution to the field equations.

All of this is a long-winded way of saying that when it comes to black holes, Calhoun is talking out his ass. A black hole is not one end of a tunnel that leads to a white hole. If you actually do the math, that doesn't work. A black hole does not "compress" matter and pass it to a white hole which decompresses it. A black hole is just a huge clump of very dense matter; when something crosses the event horizon of a black hole, it just becomes part of that clump of matter.

His babble about magnetism is similar: we've got some very elegant field equations, called Maxwell's equations, which describe how magnetic and electric fields work. It's beautiful, if complex, mathematics. And they most definitely do not describe a magnet as something that "pumps etherspace from the south pole to the north pole".

There's no proof here. And there's no math here. There's nothing here but the midnight pot-fueled ramblings of a not particularly bright sci-fi fan, who took some wonderful stories, and believed that they were based on something true.


Basic Data Structures: Hash Tables

(by MarkCC) Oct 20 2013

I'm in the mood for a couple of basics posts. As long-time readers might know, I love writing about data structures.

One of the most important and fundamental structures is a hashtable. In fact, a lot of modern programming languages have left hashtables behind, for reasons I'll discuss later. But if you want to understand data structures and algorithmic complexity, hashtables are one of the essentials.

A hashtable is a structure for keeping a list of (key, value) pairs, where you can look up a value using the key that's associated with it. This kind of structure is frequently called either a map, an associative array, or a dictionary.

For an example, think of a phonebook. You've got a collection of pairs (name, phone-number) that make up the phonebook. When you use the phonebook, what you do is look for a person's name, and then use it to get their phone number.

A hashtable is one specific kind of structure that does this. I like to describe data structures in terms of some sort of schema: what are the basic operations that the structure supports, and what performance characteristics does it have for those operations.

In those schematic terms, a hashtable is very simple. It's a structure that maintains a mapping from keys to values. A hashtable really only needs two operations: put and get:

  1. put(key, value): add a mapping from key to value to the table. If there's already a mapping for the key, then replace it.
  2. get(key): get the value associated with the key.

In a hashtable, both of those operations are extremely fast.

Let's think for a moment about the basic idea of a key-value map, and what kind of performance we could get out of a couple of simple, naive ways of implementing it.

We've got a list of names and phone numbers. We want to know how long it'll take to find a particular name. How quickly can we do it?

How long does that take, naively? It depends on how many keys and values there are, and what properties the keys have that we can take advantage of.

In the worst case, there's nothing to help us: the only thing we can do is take the key we're looking for, and compare it to every single key. If we have 10 keys, then on average, we'll need about 5 steps before we find the key we're looking for. If there are 100 keys, then it'll take, on average, about 50 steps. If there are one million keys, then it'll take an average of half a million steps before we can find the value.

If the keys are ordered - that is, if we can compare one key to another not just for equality, but we can ask which came first using a "less than or equal to" operator, then we can use binary search. With binary search, we can find an entry in a list of 10 elements in 4 steps. We can find an entry in a list of 1000 keys in 10 steps, or one in a list of one million keys in 20 steps.

With a hashtable, if things work right, in a table of 10 keys, it takes one step to find the key. 100 keys? 1 step. 1000 keys? 1 step. 1,000,000,000 keys? Still one step. That's the point of a hashtable. It might be really hard to do something like generate a list of all of the keys - but if all you want to do is look things up, it's lightning fast.
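
You can see those growth rates directly by counting comparisons. This is a quick sketch (the helper names are mine, just for illustration) that counts how many key comparisons a linear scan and a binary search each need to find the worst-placed key:

```python
def linear_search_steps(keys, target):
    # Compare against every key in order until we find the target.
    steps = 0
    for k in keys:
        steps += 1
        if k == target:
            return steps
    return steps

def binary_search_steps(keys, target):
    # Halve the (sorted) search range on each comparison.
    lo, hi, steps = 0, len(keys) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if keys[mid] == target:
            return steps
        elif keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps

keys = list(range(1_000_000))
print(linear_search_steps(keys, 999_999))  # 1000000 comparisons
print(binary_search_steps(keys, 999_999))  # 20 comparisons
```

One million keys: a million comparisons for the scan, twenty for binary search. The hashtable's trick, described next, is beating both.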

How can it do that? It's a fairly simple trick: the hashtable trades space for time. A hashtable, under normal circumstances, uses a lot more space than most other ways of building a dictionary. It makes itself fast by using extra space in a clever way.

A hashtable creates a bunch of containers for (key, value) pairs called buckets. It creates many more buckets than the number of (key, value) pairs that it expects to store. When you want to insert a value into the table, it uses a special kind of function called a hash function on the key to decide which bucket to put the (key, value) pair into. When you want to look for the value associated with a key, it again uses the hash function on the key to find out which bucket to look in.

It's easiest to understand by looking at some actual code. Here's a simple, not at all realistic implementation of a hashtable in Python:

  class Hashtable(object):
    def __init__(self, hashfun, size):
      self._size = size
      self._hashfun = hashfun
      self._table = [[] for i in range(size)]

    def hash(self, key):
      # Map the key's hashcode onto a bucket index.
      return self._hashfun(key) % self._size

    def put(self, key, value):
      # Replace any existing mapping for the key; otherwise add the pair.
      bucket = self._table[self.hash(key)]
      for i, (k, v) in enumerate(bucket):
        if k == key:
          bucket[i] = (key, value)
          return
      bucket.append((key, value))

    def get(self, key):
      for k, v in self._table[self.hash(key)]:
        if k == key:
          return v
      return None

If you've got a good hash function, and your hashtable is big enough, then each bucket will end up with no more than one value in it. So if you need to insert a value, you find an (empty) bucket using its hashcode, and dump it in: one step. If you need to find a value given its key, find the bucket using its hashcode, and return the value.

There are two big problems with hashtables.

First, everything is dependent on the quality of your hash function. If your hash function maps a lot of values to the same bucket, then your performance is going to suck. In fact, in the worst case, it's no better than just searching a randomly ordered list. Most of the time, you can come up with a hash function that does a pretty good job - but it's a surprisingly tricky thing to get right.

Second, the table really needs to be big relative to the number of elements that you expect to have in the list. If you set up a hashtable with 40 buckets, and you end up with 80 values stored in it, your performance isn't going to be very good. (In fact, it'll be slightly worse than just using a binary search tree.)

So what makes a good hash function? There are a bunch of things to consider:

  1. The hash function must be deterministic: calling the hash on the same key value must always produce the same result. If you're writing a Python program like the one I used as an example above, and you use the values of the key object's fields to compute the hash, then changing the key object's fields will change the hashcode!
  2. The hash function needs to focus on the parts of the key that distinguish between different keys, not on their similarities. To give a simple example, in some versions of Java, the default hash function for objects is based on the address of the object in memory. All objects are stored in locations whose address is divisible by 4 - so the last two bits are always zero. If you did something simple like just take the address modulo the table size, then all of the buckets whose numbers weren't divisible by four would always be empty. That would be bad.
  3. The hash function needs to be uniform. That means that it needs to map roughly the same number of input values to each possible output value. To give you a sense of how important this is: I ran a test using 3125 randomly generated strings, using one really stupid hash function (xoring together the characters), and one really good one (djb2). I set up a small table, with 31 buckets, and inserted all of the values into it. With the xor hash function, there were several empty buckets, and the worst bucket had 625 values in it. With djb2, there were no empty buckets; the smallest bucket had 98 values, and the biggest one had 106.

So what does a good hash function look like? Djb2, which I used in my test above, is based on integer arithmetic using the string values. It's an interesting case, because no one is really entirely sure of exactly why it works better than similar functions, but be that as it may, we know that in practice, it works really well. It was invented by a guy named Dan Bernstein, who used to be a genius poster in comp.lang.c, when that was a big deal. Here's djb2 in Python:

def djb2(key):
  hash = 5381
  for c in key:
    # Multiply-and-add: 33*hash + c mixes each character into the hash.
    hash = (hash * 33) + ord(c)
  return hash

What the heck is it doing? Why 5381? Because it's prime, and it works pretty well. Why 33? No clue.
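Here's a sketch of the bucket-distribution experiment I described above. I'm reconstructing the test set as the 3125 five-character strings over the letters a through e (a guess on my part, so the exact counts may not match the numbers I quoted), but the shape of the result holds: xor clumps everything into a handful of buckets, while djb2 spreads the values almost perfectly evenly.

```python
import itertools

def xor_hash(key):
    # The "really stupid" hash: just xor the characters together.
    # Note that xor is order-insensitive, so anagrams all collide.
    h = 0
    for c in key:
        h ^= ord(c)
    return h

def djb2(key):
    # djb2, as defined above.
    h = 5381
    for c in key:
        h = (h * 33) + ord(c)
    return h

def bucket_counts(keys, hashfn, num_buckets=31):
    # Hash every key, and count how many land in each bucket.
    counts = [0] * num_buckets
    for k in keys:
        counts[hashfn(k) % num_buckets] += 1
    return counts

# 5**5 = 3125 distinct five-character strings over "abcde".
keys = [''.join(p) for p in itertools.product("abcde", repeat=5)]

for name, fn in [("xor", xor_hash), ("djb2", djb2)]:
    counts = bucket_counts(keys, fn)
    print(name, "empty buckets:", counts.count(0),
          "smallest:", min(counts), "biggest:", max(counts))
```

Run it and you'll see the xor hash leave most of the 31 buckets completely empty while piling hundreds of strings into a few, where djb2 puts roughly 100 in every bucket.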

Towards the beginning of this post, I alluded to the fact that hashtables have, at least to some degree, fallen out of vogue. (For example, the C++ standard library's std::map is implemented using a red-black tree, not a hashtable.) Why?

In practice, a hashtable is often not much faster than a balanced binary tree like a red-black tree. Balanced trees have better worst-case bounds, and they're not as sensitive to the properties of the hash function. And they make it really easy to iterate over all of the keys in a collection in a predictable order, which makes them great for debugging purposes.

Of course, hashtables still get used, constantly. The most commonly used data structures in Java code include, without a doubt, the HashMap and HashSet, which are both built on hashtables. You rarely have to implement them yourself, and the system libraries generally provide a good default hash function for strings, so you're usually safe.

There's also a lot of really fascinating research into designing ideal hash functions for various applications. If you know what your data will look like in advance, you can even build something called a perfect hash function, which guarantees no collisions. But that's a subject for another time.

19 responses so far

A Note to the Trolls Re: Comment Policies

(by MarkCC) Oct 17 2013

Since yesterday's post, I've been deluged with trolls who want to post comments about their views of sexual harassment. I've been deleting them as they come in, and that has, in turn, led to lots of complaints about how horribly unfair and mean I am.

I've been doing this blogging thing for a long time, and I've watched as a number of sites that I used to really enjoy have wound up becoming worthless, due to comment trolls. There are tons of trolls out there, and they're more than happy to devote a lot of time and energy to trolling and derailing. When I started my blog, I had a very open commenting policy: I rarely if ever deleted comments, and only did so when they were explicitly abusive towards other commenters. Since then, I've learned that in the current internet culture, that doesn't work. The only way to maintain a halfway decent comment forum is to moderate aggressively. So I've become much more aggressive about removing the stuff that I believe to be trolling.

Here's the problem: Trolls aren't interested in real discussions. They're interested in derailing discussions that they don't like. I'm not interested in hosting flame wars, misogynistic rants, or other forms of trolling. In case you haven't noticed, this is my blog. I'll do what I feel is appropriate to maintain a non-abusive, non-troll-infested comment section. I am under no obligation to post your rants, and I am under no obligation to provide you with a list of bullet points of what my exact standards are. If I judge a comment to be inappropriate, I'll delete it. If you don't like that, you're welcome to find another forum, or create your own. It's a big internet out there: there's bound to be a place where your arguments are welcome. But that's not this place. If I'm over-aggressive in my moderation, the only one who'll be hurt by that will be me, because I will have wrecked the comment forum on my blog. That's a risk I'm prepared to take.

Let me add one additional comment about the particular trolls who've been coming to visit lately: I've learned, over time, a thing or two about the demographics of the people who visit this blog. As much as I'd prefer it to be otherwise, the frequent commenters on this blog are overwhelmingly male - over the history of the blog, of commenters where gender can be identified, the comments are over 90% male. Similarly, in my career as an engineer, the population of my coworkers has been very, very skewed: the engineering population at my workplaces has varied, but I've never worked anywhere where the population of engineers and engineering managers was less than 80% male.

But according to my recent trollish commenters, I'm supposed to believe that suddenly that population has changed, dramatically. Suddenly, every single comment is being posted by a woman who has never seen any male-on-female sexual harassment, but who has personally witnessed multiple female engineering managers who sexually harassed their male employees without any repercussions. It's particularly amusing, because those rants about the evil sexually-harassing female managers are frequently followed by rants about how the problem is the difference in sexual drive between men and women. Funny how women just aren't as sexually motivated as men, and that's the cause of the problem, but there are all of these evil female managers sexually harassing their employees despite their inferior female sexual drive, isn't it?

Um, guys?! You're not fooling me. You're not fooling anyone. I'm not obligated to provide you with a forum for your lies. So go away, find someplace else. Or feel free to keep submitting your comments, but know that they're going to wind up in the bit-bucket.

14 responses so far

It's easy to not harass women

(by MarkCC) Oct 16 2013

For many of us in the science blogging scene, yesterday was a pretty lousy day. We learned that a guy who many of us had known for a long time, who we'd trusted, who we considered a friend, had been using his job to sexually harass women with sleazy propositions.

This led to a lot of discussion and debate on Twitter. I spoke up to say that what bothered me about the whole thing was that it's easy to not harass people.

This has led to rather a lot of hate mail. But it's also led to some genuine questions and discussions. Since it can be hard to have detailed discussions on twitter, I thought that I'd take a moment here, expand on what I meant, and answer some of the questions.

To start: it really is extremely easy to not be a harasser. Really. The key thing to consider is: when is it appropriate to discuss sex? In general, it's downright trivial: if you're not in private with a person with whom you're in a sexual relationship, then don't. But in particular, here are a couple of specific examples of this principle:

  • Is there any way in which you are part of a supervisor/supervisee or mentor/mentee relationship? Then do not discuss or engage in sexual behaviors of any kind.
  • In a social situation, are you explicitly on a date or other romantic encounter? Do both people agree that it's a romantic thing? If not, then do not discuss or engage in sexual behaviors.
  • In a mutually understood romantic situation, has your partner expressed any discomfort? If so, then immediately stop discussing or engaging in sexual behaviors.
  • In any social situation, if a participant expresses discomfort, stop engaging in what is causing the discomfort.

Like I said: this is not hard.

To touch on specifics of various recent incidents:

  • You do not meet with someone to discuss work, and tell them about your sex drive.
  • You do not touch a student's ass.
  • You do not talk to coworkers about your dick.
  • You don't proposition your coworkers.
  • You don't try to sneak a glance down your coworker's shirt.
  • You don't comment on how hot your officemate looks in that sweater.
  • You do not tell your students that you thought about them while you were masturbating.

Seriously! Is any of this difficult? Should this require any explanation to anyone with two brain cells to rub together?

But, many of my correspondents asked, what about grey areas?

I don't believe that there are significant grey areas here. If you're not in an explicit sexual relationship with someone, then don't talk to them about sex. In fact, if you're in any work related situation at all, no matter who you're with, it's not appropriate to discuss sex.

But what about cases where you didn't mean anything sexual, like when you complimented your coworker on her outfit, and she accused you of harassing her?

This scenario is, largely, a fraud.

Lots of people legitimately worry about it, because they've heard so much about this in the media, in politics, in news. The thing is, the reason that you hear all of this is because of people who are deliberately promoting it as part of a socio-political agenda. People who want to excuse or normalize this kind of behavior want to create the illusion of blurred lines.

In reality, harassers know that they're harassing. They know that they're making inappropriate sexual gestures. But they don't want to pay the consequences. So they pretend that they didn't know that what they were doing was wrong. And they try to convince other folks that you're at risk too! You don't actually have to be doing anything wrong, and you could have your life wrecked by some crazy bitch!.

Consider for a moment, a few examples of how a scenario could play out.

Scenario one: woman officemate comes to work, dressed much fancier than usual. Male coworker says "Nice outfit, why are you all dressed up today?". Anyone really think that this is going to get the male coworker into trouble?

Scenario two: woman worker wears a nice outfit to work. Male coworker says "Nice outfit". Woman looks uncomfortable. Man sees this, and either apologizes, or makes note not to do this again, because it made her uncomfortable. Does anyone really honestly believe that this, occurring once, will lead to a formal accusation of harassment with consequences?

Scenario three: woman officemate comes to work dressed fancier than usual. Male coworker says nice outfit. Woman acts uncomfortable. Man keeps commenting on her clothes. Woman asks him to stop. Next day, woman comes to work, man comments that she's not dressed so hot today. Anyone think that it's not clear that the guy is behaving inappropriately?

Scenario four: woman worker wears a nice outfit to work. Male coworker says "Nice outfit, wrowr", makes motions like he's pawing at her. Anyone really think that there's anything ambiguous here, or is it clear that the guy is harassing her? And does anyone really, honestly believe that if the woman complains, this harasser will not say "But I just complimented her outfit, she's being oversensitive!"?

Here's the hard truths about the reality of sexual harassment:

  • Do you know a professional woman? If so, she's been sexually harassed at one time or another. Probably way more than once.
  • The guy(s) who harassed her knew that he was harassing her.
  • The guy(s) who harassed her doesn't think that he really did anything wrong.
  • There are a lot of people out there who believe that men are entitled to behave this way.
  • In order to avoid consequences for their behavior, many men will go to amazing lengths to deny responsibility.

The reality is: this isn't hard. There's nothing difficult about not harassing people. Men who harass women know that they're harassing women. The only hard part of any of this is that the rest of us - especially the men who don't harass women - need to acknowledge this, stop ignoring it, stop making excuses for the harassers, and stand up and speak up when we see it happening. That's the only way that things will ever change.

We can't make exceptions for our friends. I'm really upset about the trouble that my friend is in. I feel bad for him. I feel bad for his family. I'm sad that he's probably going to lose his job over this. But the fact is, he did something reprehensible, and he needs to face the consequences for that. The fact that I've known him for a long time, liked him, considered him a friend? That just makes it more important that I be willing to stand up, and say: This was wrong. This was inexcusable. This cannot stand without consequences.

74 responses so far

Combining Non-Disjoint Probabilities

(by MarkCC) Sep 29 2013

In my previous post on probability, I talked about how you need to be careful about covering cases. To understand what I mean by that, it's good to see some examples.

And we can do that while also introducing an important concept which I haven't discussed yet. I've frequently talked about independence, but equally important is the idea of disjointness.

Two events are independent when they have no ability to influence one another. So two coin flips are independent. Two events are disjoint when they can't possibly occur together. Flipping a coin, the event "flipped heads" and the event "flipped tails" are disjoint: if you flipped heads, you can't have flipped tails, and vice versa.

So let's think about something abstract for a moment. Let's suppose that we've got two events, A and B. We know that the probability of A is 1/3 and the probability of B is also 1/3. What's the probability of A or B?

Naively, we could say that it's P(A) + P(B). But that's not necessarily true. It depends on whether or not the two events are disjoint.

Suppose that it turns out that the probability space we're working in is rolling a six sided die. There are three basic scenarios that we could have:

  1. Scenario 1: A is the event "rolled 1 or 2", and B is "rolled 3 or 4". That is, A and B are disjoint.
  2. Scenario 2: A is the event "rolled 1 or 2", and B is "rolled 2 or 3". A and B are different, but they overlap.
  3. Scenario 3: A is the event "rolled 1 or 2", and B is the event "rolled 1 or 2". A and B are really just different names for the same event.

In scenario one, we've got disjoint events. So P(A or B) is P(A) + P(B). One way of checking that that makes sense is to look at how the probability of events work out. P(A) is 1/3. P(B) is 1/3. The probability of neither A nor B - that is, the probability of rolling either 5 or 6 - is 1/3. The sum is 1, as it should be.

But suppose that we looked at scenario 2. If we made a mistake and added them as if they were disjoint, how would things add up? P(A) is 1/3. P(B) is 1/3. P(neither A nor B) = P(4 or 5 or 6) = 1/2. The total of these three probabilities is 1/3 + 1/3 + 1/2 = 7/6. So just from that addition, we can see that there's a problem, and we did something wrong.

If we know that A and B overlap, then we need to do something a bit more complicated to combine probabilities. The general equation is:

\[ P(A \cup B) = P(A) + P(B) - P(A \cap B) \]

Using that equation, we'd get the right result. P(A) = 1/3; P(B) =
1/3; P(A and B) = 1/6. So the probability of A or B is 1/3 + 1/3 - 1/6 = 1/2. And P(neither A nor B) = P(4 or 5 or 6) = 1/2. The total is 1, as it should be.
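You can check that arithmetic by brute force: enumerate the die's outcomes as sets, and compare the inclusion-exclusion formula against counting the union directly. A small sketch (the event names are just the ones from scenario 2 above):

```python
from fractions import Fraction

# The probability space: one roll of a fair six-sided die.
outcomes = {1, 2, 3, 4, 5, 6}
A = {1, 2}  # "rolled 1 or 2"
B = {2, 3}  # "rolled 2 or 3" - overlaps A at the outcome 2

def prob(event):
    # Each outcome is equally likely, so P(E) = |E| / |outcomes|.
    return Fraction(len(event & outcomes), len(outcomes))

# Inclusion-exclusion: P(A or B) = P(A) + P(B) - P(A and B).
lhs = prob(A | B)
rhs = prob(A) + prob(B) - prob(A & B)
print(lhs, "==", rhs)  # 1/2 == 1/2
```

And the probabilities still sum to 1: P(A or B) is 1/2, and P(neither) - rolling 4, 5, or 6 - is also 1/2.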

From here, we'll finally start moving in to some more interesting stuff. Next post, I'll look at how to use our probability axioms to analyze the probability of winning a game of craps. That will take us through a bunch of applications of the basic rules, as well as an interesting example of working through a limit case.

And then it's on to combinatorics, which is the main tool that we'll use for figuring out how many cases there are, and what they are, which as we've seen is an essential skill for probability.

2 responses so far

Weekend Recipes: Chicken Wings with Thai Chile Sauce

(by MarkCC) Sep 08 2013

In my house, chicken wings are kind of a big deal. My wife doesn't know how to cook. Her cooking is really limited to two dishes: barbecued chicken wings, and grilled cheese. But her chicken wings are phenomenal. We've been married for 20 years, and I haven't found a wing recipe that had the potential to rival hers.

Until now.

I decided to try making a homemade Thai sweet chili sauce, and use that on the wings. And the results were fantastic. Still not quite up there with her wings, but I think this recipe has the potential to match it. This batch of wings was the first experiment with this recipe, and there were a couple of things that I think should be changed. I wet-brined the wings, and they ended up not crisping up as well as I would have liked. So next time, I'll dry-brine. I also crowded them a bit too much on the pan.

When you read the recipe, it might seem like the wings are being cooked for a long time. They are, but that's a good thing. Wings have a lot of fat and a lot of gelatin - they stand up to the heat really well, and after a long cooking time they just get tender and their flavor concentrates. They don't get tough or stringy or anything nasty like a chicken breast would cooked for this long.

The Sauce

The sauce is a very traditional Thai sweet chili. It's a simple sauce, but it's very versatile. It's loaded with wonderful flavors that go incredibly well with poultry or seafood. Seriously delicious stuff.

  • 1 cup sugar.
  • 1/2 cup rice vinegar.
  • 1 1/2 cup water.
  • 1 teaspoon salt.
  • 2 tablespoons fish sauce.
  • Finely diced fresh red chili pepper (quantity to taste)
  • 5 large cloves garlic, finely minced.
  • 1/2 teaspoon minced ginger.
  • 1 tablespoon of cornstarch, mixed with water.
  1. Put the sugar, salt, vinegar, water, and fish sauce into a pot, and bring to a boil.
  2. Add the garlic, ginger, and chili pepper. Lower the heat, and let it simmer for a few minutes.
  3. Leave the sauce sitting for about an hour, to let the flavors of the spices infuse into the sauce.
  4. Taste it. If it's not spicy enough, add more chili pepper, and simmer for another minute or two.
  5. Bring back to a boil. Remove from heat, and mix in the cornstarch slurry. Then return to the heat, and simmer until the starch is cooked and the sauce thickens.

The sauce is done.

The wings

  • About an hour before you want to start cooking, you need to dry-brine the wings. Spread the wings on a baking sheet. Make a 50-50 mixture of salt and sugar, and sprinkle over the wings. Coat both sides. Let the wings sit on the sheet for an hour. After they've sat in the salt for an hour, rinse them under cold water, and pat them dry.
  • Lightly oil a baking sheet. Put the wings on the sheet. You don't want them to be too close together - they'll brown much better if they have a bit of space on the sides.
  • Put the baking sheet full of wings into a 350 degree oven. After 30 minutes, turn them over, and bake for another 30 minutes.
  • Now it's time to start with the sauce! With a basting brush, cover the top side with the sweet chile sauce. Then turn the wings over, and coat the other side. Once they're basted with the sauce, it's back into the oven for another 30 minutes.
  • Again, baste both sides, and then back into the oven for another 30 minutes with the second side up.
  • Take the wings out, turn the oven up to 450. Baste the wings, and then put them back in until they turn nice and brown on top. Then turn them, baste them again, and brown the other side.
  • Time to eat!

6 responses so far
