Tuesday, April 15, 2008

Words at random, carefully chosen

On comp.ai, Dmitry Kazakov reiterated the lonely cry of a frequentist against statistical natural language. This cry has been repeated many times over the years by many people who cannot abide the treatment of documents and language as if they were random.

Let's examine the situation more carefully.

On Apr 14, 5:28 am, "Dmitry A. Kazakov" wrote:
> ... It cannot be probability because the document is obviously not random ...

The statement "It cannot be probability ..." is essentially a tautology. It should read, "We cannot use the word probability to describe our state of knowledge because we have implicitly accepted the assumption that probability cannot be used to describe our state of knowledge".

An object that was constructed in its present state by non-random processes outside our ken is, as far as we can tell, no different from one that was constructed at random (note that random does not equal uniform). What if the document were, in fact, written using the I Ching (as Philip K. Dick is reputed to have written "The Man in the High Castle")? Is it reasonable to describe the text as having been randomly generated now that we know that?

Take the canonical and overworked example of the coin being flipped. Before the coin is flipped, a reasonable observer who knows the physics of the situation and who trusts the flipper would declare the probability of heads to be 50%. After the coin is flipped, but before it is revealed, the situation is actually no different. Yes, the coin now has a state, whereas before it was only going to have one, but the only real difference is that the physics has become somewhat simpler. The most important factor in our answer about the probability has not changed: we still do not know the outcome.

Moreover, if the person flipping the coin looks at the coin, that does not and cannot change our answer.

When WE look at the coin, however, we suddenly, miraculously declare that the probability is now 100% that the coin has come up heads. Nothing has changed physically, but our estimate has changed dramatically.

Moreover, if we now examine the coin and find that it has two heads, our previous answer of 50% is still valid in the original context. If we were to repeat the experiment, the correct answer would be 100% as the probability before the flip. The only difference is our state of knowledge.
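The coin story can be put in a few lines of Python as a rough sketch. Everything named here is an illustrative assumption rather than part of the original argument: a belief is a dictionary from a coin's chance of heads to the weight we give that kind of coin, and the "wary" observer with a 1% suspicion of a two-headed coin is invented for the example.

```python
from fractions import Fraction

FAIR = Fraction(1, 2)      # chance of heads for a fair coin
TWO_HEADED = Fraction(1)   # chance of heads for a two-headed coin

def prob_heads(belief):
    """Probability of heads implied by a belief {chance_of_heads: weight}."""
    return sum(weight * bias for bias, weight in belief.items())

def update(belief, saw_heads):
    """Bayes' rule: reweight each kind of coin by how well it explains the flip."""
    reweighted = {
        bias: weight * (bias if saw_heads else 1 - bias)
        for bias, weight in belief.items()
    }
    total = sum(reweighted.values())
    return {bias: weight / total for bias, weight in reweighted.items()}

# Hypothetical observers: same coin, different states of knowledge.
trusting = {FAIR: Fraction(1)}            # trusts the flipper completely
inspected = {TWO_HEADED: Fraction(1)}     # has examined the coin
wary = {FAIR: Fraction(99, 100), TWO_HEADED: Fraction(1, 100)}

print(prob_heads(trusting))                        # 1/2
print(prob_heads(inspected))                       # 1
print(prob_heads(update(wary, saw_heads=True)))    # 103/202
```

The coin never appears in the code as a physical object. Only the belief dictionary differs between the three observers, and that alone moves the answer from 50% to 100%.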

So philosophically speaking, probability is a statement of knowledge.

Moreover, by de Finetti's famous theorem, even if this philosophical argument is bogus, the mathematics for exchangeable observations all works out AS IF there were an underlying distribution on the parameters of the system. That means that we can profitably use this philosophical argument AS IF it were true.
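One way to see the AS IF concretely is a Polya urn, which is my choice of illustration and not something from the original thread. The urn starts with one "heads" ball and one "tails" ball, and each ball drawn is returned along with an extra copy of itself. No bias parameter appears anywhere in that process, yet the probability of any sequence matches what you would get from flipping a coin of unknown bias with a uniform prior on the bias. The sketch below checks this for short sequences only; it illustrates the representation and is not a proof of the theorem. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def urn_sequence_prob(seq, heads=1, tails=1):
    """Probability of a 0/1 sequence drawn from a Polya urn."""
    p = Fraction(1)
    for x in seq:
        total = heads + tails
        if x == 1:
            p *= Fraction(heads, total)
            heads += 1   # return the ball plus one extra heads ball
        else:
            p *= Fraction(tails, total)
            tails += 1   # return the ball plus one extra tails ball
    return p

def uniform_mixture_prob(seq):
    """Integral over [0, 1] of theta^k * (1 - theta)^(n - k), uniform prior.

    This is the Beta integral, which equals k! (n - k)! / (n + 1)!.
    """
    n, k = len(seq), sum(seq)
    return Fraction(factorial(k) * factorial(n - k), factorial(n + 1))

# Every sequence of length 4 gets the same probability either way.
for seq in product((0, 1), repeat=4):
    assert urn_sequence_prob(seq) == uniform_mixture_prob(seq)

print(urn_sequence_prob((1, 1, 0)))   # 1/12
print(urn_sequence_prob((0, 1, 1)))   # 1/12, order does not matter
```

The urn's draws are dependent on one another, but because their order does not affect the probability, they can be treated as independent flips of a coin whose bias we are uncertain about.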

The upshot is that even if you are a frequentist in your heart of hearts, it will still pay to behave as if you were a Bayesian. And I, as a Bayesian, will be able to behave as if you were rational because I will not know your secret.
