JREF Challenge Statistics

You've been answered and dismissed.
Well, this time you are at least half right.

Does anyone else reading this thread, who might have understood TC's "answer" better than I did, want to take a stab at applying it to the concrete example we now have to work with?
 
T'ai Chi said:
Perhaps.

In some cases, if there is a super tiny p-value (what do we have here, 1 x 10^-12 or something?), it is difficult to say without better designed studies.

petre said:
So from this, can we assume that you've finally accepted JREF tests are not suited to the purpose you desire, and as such you now look elsewhere?

You, as before, are free to assume what you'd like.

I'm not sure how the above follows.

Well, when you said "here", perhaps I was assuming too much when I guessed that you were referring to the offered statistics on JREF tests. You appeared to be commenting that something was "difficult to say" without "better designed studies". Therefore, I was asking if you were now going to depart in search of those "better designed studies" that would be less "difficult to say" whatever it is you wish to say about whatever it is you are actually interested in.
 
I got a PM today asking me if I kept any notes on what I reviewed for the table I produced. The answer is that I spreadsheeted all the claims in Excel (which retained the URLs to the posts), and there are many more classifications and "scorings" that weren't presented in the table.

I found that the results might serve as a small insight into the preliminary application process - it certainly intrigued me.

I'm more than happy to supply the first pass file to anyone interested (PM me), but I think I'll do a second pass and apply more rigour to my scoring/classification process. Once the claim process is back up and running when James has recovered his health and his fire, I think it would be very interesting to keep tabs on the process, especially where it breaks down, and especially the *reasons* the process breaks down.

Give me some time to refine my review process - and I'll make it available to all. As I said, if anyone wants the first pass Excel for a reference - PM me, but the 2nd pass will probably be more useful.
 
We look forward to your 2nd pass.

If you find the time and energy, could you keep an eye on the "tests that are statistical in nature", which T'ai Chi mentioned in his OP?

'Bout time we got back to constructive grounds here. After all, some people in this thread have a lot going for them. (My math skills suffice only to correctly count my change in stores.)
 
We look forward to your 2nd pass.

If you find the time and energy, could you keep an eye on the "tests that are statistical in nature", which T'ai Chi mentioned in his OP?

'Bout time we got back to constructive grounds here. After all, some people in this thread have a lot going for them. (My math skills suffice only to correctly count my change in stores.)
That's indeed what I want to fix in my second pass.

I got so disenchanted with the number of claims that didn't even get to the protocol stage that I stopped bothering to note the proposed targets and agreed targets.

One of the things that may prove interesting is to see what the original claimed success rate was and what the protocol process decided was a reasonable success rate.

Hopefully I'll get it done by the time the "claims department" is up and running again. A little welcome back present for James, perhaps.
 
From my page
I've always been interested in data from the past as well.

You did not state that *here*, in your original question about determining the % of female applicants. If you're going to discuss things here, don't expect us to go looking through your other writings, not posted here, to fill in the gaps in your conversation.

So, again, why do you assume I did not calculate what you are demanding? And why do you think what you think I calculated or not matters to proposing the idea of seeing interesting data?

I *inferred* that you'd not calculated this because (a) you claimed it seemed extraordinarily hard to do (b) you asked the JREF to provide more information (c) you've never said that you have calculated it and (d) you've never posted the number you (now appear to claim to have) calculated.

I notice that you've not actually answered my question -- do you now claim to have calculated the % of female applicants? yes or no? You're continuing in your now apparently usual method of evading the question. You could quite easily clarify the matter by either saying that you have not calculated this percentage (that *you* *yourself* claim to be an interesting data point), or by saying that you have calculated it and found it to be X.

Come on TC, answer the direct question.
 
Well, most of the tests with dowsers have a preliminary alpha of what, .001? Let's assume that we set the alpha to about this level for the other tests that involve effective randomness.

The chance of at least one success in the next N tests is 1-(1-.001)^N.

The expectation value for the number of successes in the next N tests is simply .001N.

I'm assuming, of course, that there is no cheating, and that there aren't any real dowsers.

This means we'd expect that in the next 1000 tests, there would be a greater than 50% chance of at least one winner. (The 50% mark is reached after the 693rd test is done.)

Obviously, we're not counting tests like "I have a perpetual motion machine", which have nothing to do with probability.
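The arithmetic in the post above can be checked directly. This is a minimal sketch under the post's own assumptions (every test has an independent 0.001 chance of passing, no cheating, no real dowsers):

```python
import math

ALPHA = 0.001  # assumed per-test chance of passing, as in the post


def p_at_least_one_success(n, alpha=ALPHA):
    """Probability that at least one of n independent tests passes by chance."""
    return 1 - (1 - alpha) ** n


def expected_successes(n, alpha=ALPHA):
    """Expected number of chance passes in n tests."""
    return alpha * n


# Smallest n with P(at least one winner) > 50%:
n_half = math.ceil(math.log(0.5) / math.log(1 - ALPHA))
print(n_half)                         # 693
print(p_at_least_one_success(1000))   # ~0.632, i.e. greater than 50%
print(expected_successes(1000))       # 1.0
```

Note that in 1000 tests the chance of at least one winner is about 63%, not 100%, even though the expected number of winners is exactly 1.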
 
Well, most of the tests with dowsers have a preliminary alpha of what, .001? Let's assume that we set the alpha to about this level for the other tests that involve effective randomness.

Unfortunately -- as has been repeatedly pointed out -- this is not a justifiable assumption.

The nominal 0.001 cutoff is simply a typical maximum that the JREF will accept (to keep them from having to deal with paranormal claims like "I can predict the suit of the next card in a deck, as long as I don't have to repeat the trial"). But it's not at all clear that all, or even "most", of the tests use this cutoff. A relatively recent challenger, for example, claimed that he could make it snow in Oakland, CA, on a specific July date. I can certainly establish that the probability of this happening is less than 0.001 (it hasn't happened yet, and there have been more than 1000 July days in the past hundred or so years) -- but it's sufficiently less than 0.001 that we can't talk meaningfully about the expected number of successes...

In point of fact, most of the JREF challenge communications don't get to the point where I can assess what the actual probability of success is. The applicants aren't able to define their protocols well enough. If we have a thousand applicants, and of those thousand, 998 never get to the point of taking the test, what can we say about the expected number of successes?
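The objection above can be made concrete: with heterogeneous per-test probabilities, the expected number of chance successes is the sum of the individual probabilities over the tests actually run, not 0.001 times the number of applicants. A minimal sketch, with invented probabilities purely for illustration:

```python
# Hypothetical chance-of-success for the few applicants who actually
# reached a test (values are made up for illustration); the "snow in
# Oakland in July" claim would contribute something tiny like 1e-9.
p_values = [0.001, 0.001, 0.0005, 1e-9]

applicants = 1000  # most of these never reach a defined protocol

# Expected chance successes = sum of probabilities over tests run.
expected = sum(p_values)
# The naive estimate applies a flat 0.001 to every applicant.
naive = 0.001 * applicants

print(expected)  # ~0.0025
print(naive)     # 1.0
```

Under these (invented) numbers the naive estimate overstates the expected number of chance winners by a factor of several hundred, which is the point: without knowing each test's actual probability, and which applicants were tested at all, 0.001 x N tells us very little.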
 
The nominal 0.001 cutoff is simply a typical maximum that the JREF will accept (to keep them from having to deal with paranormal claims like "I can predict the suit of the next card in a deck, as long as I don't have to repeat the trial"). But it's not at all clear that all, or even "most", of the tests use this cutoff.

Ask Chip Denman, a statistician who consults with JREF, about what alpha JREF typically uses for tests that are statistical in nature.
 
"dr" said

But it's not at all clear that all, or even "most" of the tests use this cutoff.

I suggested how he could learn where to get that info. I'm not repeatedly demanding anyone answer a question.
 