The thing is, you have to agree this beforehand - 10 out of 10 or 18 out of 20 can be mutually agreed as the level of performance which constitutes a pass. Less than this is a fail. Once that has been agreed (and if the applicant tries to insist that 16 out of 20 should be a pass, you have to explain why you don't buy that), then the outcome is self-evident. He either achieves it or he doesn't.
Clear enough?
I understand what you are saying, and I know that's how Randi runs his tests, but it seems to me completely the wrong way to do it. It seems absurd to me that, with a passmark of 80%, Randi would declare any score up to 79% a failure.
There should be three possible results: definite pass, definite failure, or more tests required.
Example:
http://www.skeptics.com.au/journal/divining.htm
In this dowsing test, dowsers had a 1 in 10 chance of guessing the correct pipe, so you would expect a 10% hit rate. In fact they scored 22% on the water test.
This is a lot lower than the passmark of 80%, but a lot higher than chance: the odds against it are about 107-1. Statistically speaking, we can say that there is a better than 99% chance that there is some real dowsing effect here.
When you get results of this type, you have to decide whether it's real or just a fluke. So you should run further tests to see if the results are replicated. If you run the test several times, and the dowsers score above 20% on every test, that would be conclusive evidence of dowsing. The fact that they boast of a 100% success rate wouldn't matter.
In your hypothetical example, a score of 16 out of 20 would again be significantly higher than chance, and I would want to see if he can repeat it with the same result.
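To put rough numbers on that, here is a minimal Python sketch of the three-outcome rule. It is only an illustration: the function names, the 1% cut-off for "significantly higher than chance", and the 50% chance-per-trial figure are my own assumptions, since the hypothetical didn't say what a pure guesser would be expected to score.

```python
from math import comb


def tail_probability(hits, trials, chance):
    """Probability of getting `hits` or more out of `trials` by pure guessing,
    where each trial succeeds with probability `chance` (binomial upper tail)."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(hits, trials + 1)
    )


def verdict(hits, trials, chance, pass_mark, alpha=0.01):
    """Three-way outcome instead of a plain pass/fail.

    pass_mark: agreed number of hits for an outright pass.
    alpha:     assumed cut-off (not from the thread) below which a score
               counts as too unlikely to be dismissed as a fluke.
    """
    if hits >= pass_mark:
        return "pass"
    if tail_probability(hits, trials, chance) < alpha:
        return "more tests required"
    return "fail"


# Hypothetical case from the thread: 16 out of 20 against a passmark of 18.
# The 0.5 chance level is an assumption made purely for illustration.
p = tail_probability(16, 20, 0.5)
print(f"{p:.4f}")                # 0.0059
print(verdict(16, 20, 0.5, 18))  # more tests required
```

Under those assumptions, 16 out of 20 misses the passmark but would turn up by guessing less than 1% of the time, so it lands in the "more tests required" bin rather than being written off as a failure.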
That's how the test should be run.
Randi, of course, won't do that.