
Always 50/50 chance?

All right, this second example is complicated. I've enclosed it in spoiler tags, because the work required is very LaTeX-heavy and the formulas involved are lengthy. If you're into probability, give it a look-see.

The probability of a red shirt surviving an episode of the original Star Trek:

Please note the following definitions:

What if we use the Scotty ratio function of fs(OB:TS:TN), where
OB = Overblown timeframe Scotty originally quotes the Captain
TS = Time it really takes Scotty to get the job done
TN = Time it takes a normal engineer to do the same thing

How will that change the solution if Scotty gives an estimate of 10 minutes until the engines explode?
 

Hah, trying to trip me up now, I see!

The only effect it has is that the probability ratio is adjusted to Pr(Sr)1 = .20 and Pr(Sr)2 = .80, where
Pr(Sr)1 is the probability the red shirt dies in the first half of the show, and Pr(Sr)2 is the probability he dies in the second half.

Also, the Enterprise will warp out of danger approximately 11 seconds before the Klingons finish off the damaged warp engine with their disruptor beams.
 

What if we now learn that the RedShirt has a Tribble fetish? Will this new knowledge change the survival function, or are the two events independent of each other?
 

As that was only covered in one episode, I think it's fair to say we're dealing with an outlying data point that should be excluded.

Besides, who doesn't have a Tribble fetish? I thought that was the whole point!
 
Let me talk you through an application of Bayesian probability to computer vision; feel free to point out where human intuition kicks in.

We want to find all of the grass in a photo.
For tractability, we model these images as a Markov random field where each pixel is connected to its nearest neighbours.

We set the unary potential of each pixel to the probability that a pixel of that colour, in that location, is grass, estimated from the frequency of occurrences in a training set with labelled ground truth.

We form the pairwise links between pixels similarly, learning how often two pixels with a similar colour difference between them and in a similar location share the same label.
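The counting scheme above can be sketched in a few lines of Python. This is only a toy illustration: the 4x4 "image", the 4-connected neighbourhood, and the crude colour-match binning are all stand-ins for a real training set and a real colour-difference measure.

```python
from collections import defaultdict

# Toy "image": a 4x4 grid of colour labels; ground truth marks grass (1) or not (0).
train_image = [["green", "green", "blue", "blue"],
               ["green", "green", "blue", "blue"],
               ["brown", "green", "blue", "blue"],
               ["brown", "brown", "blue", "blue"]]
train_truth = [[1, 1, 0, 0],
               [1, 1, 0, 0],
               [0, 1, 0, 0],
               [0, 0, 0, 0]]

# Unary potentials: P(grass | colour, location), estimated by counting how
# often each (colour, location) pair was labelled grass in training.
grass_counts = defaultdict(int)
total_counts = defaultdict(int)
for r, row in enumerate(train_image):
    for c, colour in enumerate(row):
        key = (colour, r, c)
        total_counts[key] += 1
        grass_counts[key] += train_truth[r][c]

def unary(colour, r, c, prior=0.5):
    """P(grass | colour, location); falls back to a flat prior for unseen pairs."""
    key = (colour, r, c)
    if total_counts[key] == 0:
        return prior
    return grass_counts[key] / total_counts[key]

# Pairwise potentials: how often 4-connected neighbours share a label, as a
# function of whether their colours match (a crude colour-difference "bin").
same_label = defaultdict(int)
pair_total = defaultdict(int)
for r in range(len(train_image)):
    for c in range(len(train_image[0])):
        for dr, dc in ((0, 1), (1, 0)):  # right and down neighbours
            r2, c2 = r + dr, c + dc
            if r2 < len(train_image) and c2 < len(train_image[0]):
                colours_match = train_image[r][c] == train_image[r2][c2]
                pair_total[colours_match] += 1
                same_label[colours_match] += (train_truth[r][c] == train_truth[r2][c2])

def pairwise(colour_a, colour_b):
    """P(neighbours share a label | their colours match or not)."""
    colours_match = colour_a == colour_b
    return same_label[colours_match] / pair_total[colours_match]

print(unary("green", 0, 0))        # 1.0: always grass at that spot in training
print(pairwise("green", "green"))  # 1.0: same-colour neighbours always agreed
print(pairwise("green", "blue"))   # differing colours rarely shared a label
```

With a real training set the counts would be pooled over colour and location bins rather than exact pixels, but the estimation logic is the same.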


What if you don't have any photos to use as a training set? What probabilities do you use then?

What if you have only a single photo in your training set? Would you say that, if a pixel is grass in the training photo, the probability is 1 that the same pixel is grass in the photo you're analyzing, provided the two pixels have the same color? And that in all other cases the probability is 0?

In order to get anywhere, you need some prior probability distribution before you even start looking at the training photos.

(Somewhat related: why is there an n - 1 in the denominator instead of an n, when estimating the standard deviation from a sample? Because in general a sample isn't perfectly representative of the population. Same for the training set vs. "the real probabilities".)
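The n - 1 point is easy to check empirically. A minimal Python simulation (the sample size and trial count here are arbitrary choices of mine) shows that dividing the sum of squared deviations by n systematically underestimates the population variance, while n - 1 corrects for it:

```python
import random
import statistics

random.seed(0)

# Draw many small samples from a standard normal (population variance = 1.0)
# and compare the two variance estimators.
n = 5
biased, unbiased = [], []
for _ in range(20000):
    sample = [random.gauss(0, 1) for _ in range(n)]
    m = statistics.fmean(sample)
    ss = sum((x - m) ** 2 for x in sample)
    biased.append(ss / n)          # divide by n
    unbiased.append(ss / (n - 1))  # divide by n - 1 (Bessel's correction)

print(statistics.fmean(biased))    # ≈ 0.8, i.e. (n-1)/n: systematically low
print(statistics.fmean(unbiased))  # ≈ 1.0: matches the population variance
```

The sample mean sits closer to the sample than the true mean does, so the raw average of squared deviations comes out low by a factor of (n - 1)/n; dividing by n - 1 undoes exactly that.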
 
Formally speaking, you should use a uniform distribution to represent the total absence of knowledge about the situation; otherwise you could end up on the losing side of a Dutch book.*

In the case of very small data sets I recommend using Laplace's rule of succession, which effectively assumes that both grass and not-grass have appeared once everywhere when you weren't looking; this stops you overstating your knowledge of the system.

I'm not arguing that the assignment of the probabilities could not be done better, given an effective a priori distribution. I'm showing Lenny how a decent Bayesian estimate can be arrived at without using human intuition.
*Disclaimer: I'm only talking about finite, discrete states here.
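A minimal sketch of the rule of succession in Python (the function name and the two-category grass/not-grass framing are my own):

```python
def laplace_rule(successes, trials, categories=2):
    """Laplace's rule of succession: add one pseudo-count per category.

    With no data at all this reduces to the uniform 1/categories, and it
    never assigns probability 0 or 1 on the basis of a finite sample."""
    return (successes + 1) / (trials + categories)

# No training photos at all: fall back to the uniform prior.
print(laplace_rule(0, 0))      # 0.5

# One training photo in which the pixel was grass: confident, not certain.
print(laplace_rule(1, 1))      # 2/3

# A hundred grass observations still leave a little room for doubt.
print(laplace_rule(100, 100))  # 101/102
```

This directly answers the single-photo question above: one matching grass pixel gives a probability of 2/3, not 1.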
 
Why be so disparaging of other people's posts? Most people spend their lives thinking in words; to understand statistics you need to think in maths - it's just another type of language. And like any language, you need to spend time learning its application. You may find it counter-intuitive - that doesn't mean counter-intuitiveness is a universal attribute.

I wasn't disparaging his post. You're reading the wrong tone into it, which is easy enough to do given the communicative limits of natural written language. Admittedly I'm not adding [tone] [/tone] tags to every sentence, so please read a non-disparaging tone into all of my past and future posts. I'll leave the present post as an exercise for the reader. :p
 

One could ask the same questions about how the human ability to identify grass develops from first consciousness - or alternatively, how that ability has developed culturally over time (or genetically, if it's partially hard-wired). The degree to which a Bayesian (or Laplacean, etc.) process is more efficient than human processes lacking these developed techniques could serve as a measure of how much better or more efficiently it models apparent reality.
 
