So, are people in favor of seeing actual data, in a summarized format, from the skeptical organizations that do such tests?
No, because the inevitable (over)simplifications necessary to put the test data into summarized format will result in the summaries being useless and actively misleading.
DrKitten is quite right: the attempt to combine such disparate studies is likely to be worse than useless.
Why though?
If you have 20 dowsing experiments done similarly (choosing which cup the gold is under, etc.), it seems very reasonable to combine the results. It seems to work in every other field; why not with skeptical organizations?
Because we don't have 20 dowsing experiments done similarly. We have one dowsing experiment finding gold under a cup, one dowsing experiment finding addresses with a pendulum, one telepathy experiment sending thoughts to another person, one martial arts experiment attempting to stop an attacker without touching him... How do you combine those results?

In addition, we may have one tested against a claim of 100% accuracy, another tested against a claim of 90%, another against a claim of 60%... Each of these may (depending on the deal agreed to by both parties) result in a different cutoff level, and those cutoffs cannot be combined in a meaningful manner.
By the way, can you tell me what element all of those tests had in common?
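To make the comparability problem concrete, here is a toy Python sketch (all the scores, trial counts, and chance baselines below are invented for illustration): raw hit rates from differently designed tests can't simply be pooled, because each test has its own chance baseline.

    # One-sided binomial p-value: P(X >= hits) under each test's own null.
    from math import comb

    def p_value(hits, trials, chance):
        return sum(comb(trials, j) * chance**j * (1 - chance)**(trials - j)
                   for j in range(hits, trials + 1))

    tests = [
        # (description,                      hits, trials, chance baseline)
        ("gold under 1 of 10 cups",             4, 10, 0.10),
        ("pendulum pick of 1 in 5 addresses",   3, 10, 0.20),
        ("telepathy, binary send/receive",      7, 10, 0.50),
    ]

    for name, hits, n, chance in tests:
        print(f"{name}: {hits}/{n} hits ({hits / n:.0%}), "
              f"p = {p_value(hits, n, chance):.3f}")

Here the 40% scorer (p = 0.013) shows far stronger evidence than the 70% scorer (p = 0.172), so averaging hit rates across the tests would be meaningless, and that is before the different negotiated cutoffs even enter into it.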
What a claimant believes about their performance doesn't interest me, but how they actually perform.

What the claimant claims has a direct bearing on the test; it may mean that one test ends as a failure with results that would amount to only a small fraction of the required attempts in another. Thus what a claimant believes about their performance has a direct bearing on how they will be allowed to perform in the test. (That is, a person claiming 90% accuracy, who agrees to a 20-trial test, will fail a preliminary even if they score slightly above chance. Suppose they score at a rate that would be statistically significant if maintained over 100 trials; the problem is, their test ended after 20 trials. It is impossible to know whether they would have continued at that rate, or regressed to the mean.)
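Here is the arithmetic behind that parenthetical, as a small Python sketch (the 13/20 score is my own illustrative stand-in for "slightly above chance"; the 90% claim and the 20- and 100-trial figures are from the example above):

    # Exact one-sided binomial p-values for the 90%-claim example.
    from math import comb

    def p_at_least(k, n, p):
        """P(X >= k) for X ~ Binomial(n, p)."""
        return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

    def p_at_most(k, n, p):
        return 1 - p_at_least(k + 1, n, p)

    # 13/20 (65%) is "slightly above chance" but nowhere near significant...
    print(f"P(>=13/20  | chance):    {p_at_least(13, 20, 0.5):.3f}")   # ~0.132
    # ...yet the same 65% rate sustained over 100 trials would give:
    print(f"P(>=65/100 | chance):    {p_at_least(65, 100, 0.5):.4f}")  # ~0.0018
    # And measured against the agreed 90% claim, 13/20 is a decisive failure:
    print(f"P(<=13/20  | 90% claim): {p_at_most(13, 20, 0.9):.4f}")    # ~0.0024

So the same 20-trial score is simultaneously a clear failure of the negotiated claim and too short a run to say anything against chance.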
I'm not interested in what a person believes about their performance (they could be mistaken) but in how they actually perform, much like I'm not interested in what a doctor thinks about a drug, but in how the drug actually performs.

You missed my point.
They served their purpose. They were not designed to serve yours.
Glad it was never claimed they were designed to serve my purposes...

Oh, heavens, let's not ever make claims... glad it was never claimed that they were designed to serve your purposes. Rather, you asked about combining data, and I did my best to help you understand why it can't be done meaningfully. That's all. No "claims" were made, so you can be safe.
In any case, in these tests one compares what one expects to what the claimant actually does, and then measures the difference numerically to see whether it is significantly far away.

No. We compare what the actual claim is to what the claimant actually does. We do not compare it to what we would expect to see by chance. There is a world of difference. We could, very easily, do the latter; the former has simply turned out to be considerably easier and quicker to do.
Again, the issue of combining experiments aside, wouldn't it still be nice to see a list of such statistical results from the preliminary experiments, all in one place, from the various skeptical organizations, available to all interested parties over the internet, without having to fly to each organization to read through papers? Even something absurdly simple: how many of the preliminary tests were on dowsers? Of those, how many scored higher than one would expect? Etc. Basic information interested parties would hope to find.

Perhaps. It would certainly be helpful to classes like mine. Even more helpful would be access of this sort to the raw data from the parapsychologists' labs. Doesn't Schwartz have some? (Maybe my memory is playing tricks.) I know the most recent Bem precognition data would be a great set to run a time-series analysis on, to see whether inadequate randomization predicts performance through a classical-conditioning mechanism. If I am not mistaken, that database would be significantly larger and better controlled, since (in theory) those tests are run not against claims but against chance. Of course, I would also like to see such experiments video-recorded (no, I am not holding them to a higher standard than, say, psych experiments; I would like a video archive for psych experiments as well) so that experimental methodology can be examined in a bit more detail than an article's methods section can manage.
Let us take the extreme example in which claimants say they have complete control and will always be able to determine the coin's face. We can test this very easily: just start flipping. There is a .5 probability that, by chance alone, any given person will fail after one toss. But that person can stop then; the trial is over. If, on the other hand, the person got the first one right, there is again a .5 probability of failure on the next toss (again, by chance alone). With enough claimants, we may have some people who get 5, or 10, or more coins called correctly before making a mistake, all by chance alone. (Of course, if they *can* influence the outcome perfectly, they will never make the mistake. And yes, I recall that I am taking the extreme 100% position here, but it extrapolates to lesser claims.)
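A quick Monte Carlo sketch of that scenario in Python (the pool of 1,000 claimants and the seed are arbitrary assumptions for illustration):

    # Each "claimant" calls coin flips until the first miss, with no
    # ability at all: every call is correct with probability 1/2.
    import random

    random.seed(1)
    CLAIMANTS = 1000

    def streak():
        n = 0
        while random.random() < 0.5:  # correct call, by chance alone
            n += 1
        return n

    runs = [streak() for _ in range(CLAIMANTS)]
    for k in (5, 10):
        hit = sum(r >= k for r in runs)
        print(f"{hit} of {CLAIMANTS} called at least {k} in a row "
              f"(expected ~{CLAIMANTS / 2**k:.0f})")
    print("longest streak by pure chance:", max(runs))

With 1,000 chance-level claimants you expect about 31 to call five in a row and about one to call ten in a row, which is exactly why a few impressive streaks in a large pool prove nothing.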
In this scenario, in order to do the test one assumes that the person making the claim is truthful about their claimed abilities. Unfortunately, that is the very thing one is trying to ascertain by the test in the first place. Perhaps they really only perform at the 90% level, or at the 70% level, or some other level.
It doesn't make much sense to say 'OK, since you're saying you perform at the K% level, we'll test you at that level, and if you don't perform at it, you're wrong.' It makes sense to say 'We know that with regular coins we'd expect you to perform at the 50% level, and if you don't perform significantly away from this, you're wrong.'
Doing so makes the test longer, harder, and more expensive by requiring more trials.
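To put rough numbers on "longer, harder, and more expensive", here is a minimal Python sketch using the standard normal-approximation sample-size formula for a one-sided binomial test (the alpha = 0.001 and power = 0.99 levels are my own illustrative assumptions, not any organization's actual protocol):

    # Approximate trials needed to separate a claimed hit rate p1 from
    # the chance rate p0, using the two-proportion normal approximation.
    from math import sqrt, ceil
    from statistics import NormalDist

    def trials_needed(p1, p0=0.5, alpha=0.001, power=0.99):
        z_a = NormalDist().inv_cdf(1 - alpha)   # false-positive control
        z_b = NormalDist().inv_cdf(power)       # chance of passing if genuine
        n = ((z_a * sqrt(p0 * (1 - p0)) + z_b * sqrt(p1 * (1 - p1)))
             / (p1 - p0)) ** 2
        return ceil(n)

    for claim in (0.99, 0.90, 0.60, 0.55):
        print(f"claimed rate {claim:.0%}: ~{trials_needed(claim)} trials")

Under those assumptions a 99% claim settles in about 14 trials and a 90% claim in about 32, while a bare 55%-versus-chance deviation takes roughly 3,000; that is why testing against the stated claim, rather than against chance alone, is the quicker design.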