I'm not too happy with this answer. The appropriate denominator goes to infinity as the sample size goes to infinity, though much more slowly to be sure.
Originally posted by T'ai Chi
1. For a mound-shaped distribution, what is a decent way to estimate the standard deviation from the range?
Assuming approximate normality, take range/4 for small data sets, or range/5 or range/6 for larger ones.
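As a rough check on these divisors, here is a small simulation sketch. It assumes exactly normal data; the helper names `sample_sd` and `mean_range_over_sd`, the sample sizes, and the trial count are illustrative choices, not something from the thread.

```python
import math
import random


def sample_sd(xs):
    """Sample standard deviation with the usual n - 1 denominator."""
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))


def mean_range_over_sd(n, trials=2000, seed=0):
    """Average of (sample range / sample SD) over simulated normal samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += (max(sample) - min(sample)) / sample_sd(sample)
    return total / trials


if __name__ == "__main__":
    # Print the average ratio for a few sample sizes.
    for n in (10, 30, 100, 500):
        print(n, round(mean_range_over_sd(n), 2))
```

For normal data the average ratio should come out near 3 for n = 10, near 4 for n = 30, near 5 for n = 100 and near 6 for n = 500, so the divisor keeps growing with the sample size, only slowly.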
What do you mean by "break down"? The maximum likelihood estimate of the coin's probability is indeed 1 and 0, respectively, in those cases. If those answers don't seem right, it means that your intuitive notion of "the most sensible estimate" does not correspond to the maximum likelihood estimate. So why use maximum likelihood in other cases? It's not as obviously wrong in those cases, but it still isn't "the most sensible estimate," I'd say. For one reason, it doesn't take into account any prior information you may have.
9. If we only observe the outcomes from coin flipping (Heads = 1, Tails = 0): 1000100010101, what is the most sensible estimate of the probability of Heads? What general mathematical technique would you use here?
5/13 would be a sensible estimate. Call each head a success, with probability p, and each tail a failure, with probability 1-p. From our string of 1's and 0's, the likelihood function (multiplying the probabilities together) is p^5*(1-p)^8. Differentiating this and setting the derivative equal to 0 yields p-hat = 5/13. (Of course, you still have to show it is a maximum.)
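A quick numerical sanity check of that estimate, as a sketch only: it maximizes the log-likelihood over a grid instead of doing the calculus. The grid resolution and the names `log_likelihood` and `p_hat` are arbitrary choices made for illustration.

```python
import math

flips = "1000100010101"
heads = flips.count("1")  # number of successes
tails = flips.count("0")  # number of failures


def log_likelihood(p):
    """Log of p^heads * (1 - p)^tails, defined for 0 < p < 1."""
    return heads * math.log(p) + tails * math.log(1.0 - p)


# The second derivative, -heads/p^2 - tails/(1-p)^2, is negative on (0, 1),
# so the log-likelihood is concave and its stationary point is the maximum.
grid = [i / 10000 for i in range(1, 10000)]
p_hat = max(grid, key=log_likelihood)

if __name__ == "__main__":
    print(heads, tails, p_hat, heads / (heads + tails))
```

The grid maximizer should agree with 5/13 up to the grid spacing.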
The question arose: "What do we do if we get a string like 111111, or 0000? Does the above maximum likelihood estimation method break down?" There is something called a Wilson estimate that overcomes this. Instead of estimating p as X/n, where X is the number of successes, it estimates p as (X+2)/(n+4).
I'm not sure how (X+2)/(n+4) was arrived at. If the question is "what's the probability that the next toss will yield heads?" (which is not exactly the same question as "what's the best estimate of the coin's 'true' probability?"), Laplace's rule of succession gives (X+1)/(n+2). That assumes a uniform prior for the coin's probability. In other words, we suppose that all the possible coin probabilities, from 0 to 1, are equally likely a priori. (This is unlikely to correctly represent our state of knowledge about a real coin, of course.)
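To compare the estimators on the extreme strings, here is a minimal sketch. It uses the standard form of Laplace's rule of succession, (X+1)/(n+2). The function names are made up for this example. As far as I know, (X+2)/(n+4) is approximately the midpoint of the 95% Wilson score interval, (X + z^2/2)/(n + z^2) with z^2 close to 4, which may be where the "Wilson estimate" name comes from.

```python
from fractions import Fraction


def mle(x, n):
    """Maximum likelihood estimate X/n."""
    return Fraction(x, n)


def laplace(x, n):
    """Laplace's rule of succession (uniform prior): (X+1)/(n+2)."""
    return Fraction(x + 1, n + 2)


def plus_four(x, n):
    """The (X+2)/(n+4) estimate discussed above."""
    return Fraction(x + 2, n + 4)


if __name__ == "__main__":
    for flips in ("111111", "0000", "1000100010101"):
        x, n = flips.count("1"), len(flips)
        print(flips, mle(x, n), laplace(x, n), plus_four(x, n))
```

For 111111 the three estimates are 1, 7/8 and 4/5; for 0000 they are 0, 1/6 and 1/4. Only the maximum likelihood estimate lands exactly on the boundary.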
For a given range, the distribution with the largest variance is the one where all the probability is as far from the mean as possible. Without loss of generality, take the mean to be zero. If the range is 2r, place half the probability at -r and the other half at r. In other words, consider a random variable X with P(X = -r) = P(X = r) = 1/2. The standard deviation is r, so range/SD = 2.
30. For what distributions is range/SD >= sqrt(2)?
All of them. I found this in my notes from a theory class (without proof). I'm trying to prove it, but am having a hard time.
I'm not sure why your notes say sqrt(2). Have I made a mistake? Can anyone give a distribution where the ratio is less than 2?
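One way to look for a counterexample is a brute-force search over random discrete distributions. This is a numerical spot check, not a proof, and the helper names, support sizes and trial count are arbitrary. The bound of 2 is what Popoviciu's inequality on variances, Var(X) <= (range)^2 / 4, would predict for any bounded distribution.

```python
import math
import random


def range_over_sd(values, probs):
    """Range of the support divided by the standard deviation of a discrete distribution."""
    mean = sum(p * v for v, p in zip(values, probs))
    var = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))
    return (max(values) - min(values)) / math.sqrt(var)


def smallest_ratio(trials=10000, seed=0):
    """Smallest range/SD found among random distributions on 2 to 6 points in [-1, 1]."""
    rng = random.Random(seed)
    best = math.inf
    for _ in range(trials):
        k = rng.randint(2, 6)
        values = [rng.uniform(-1.0, 1.0) for _ in range(k)]
        weights = [rng.random() + 1e-9 for _ in range(k)]  # strictly positive weights
        total = sum(weights)
        probs = [w / total for w in weights]
        best = min(best, range_over_sd(values, probs))
    return best


if __name__ == "__main__":
    print(range_over_sd([-1.0, 1.0], [0.5, 0.5]))  # the two-point case from above
    print(smallest_ratio())
```

If the reasoning above is right, the search should never dip below 2 (apart from floating-point rounding), with the symmetric two-point distribution sitting exactly at the bound.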