
Randi on Sheldrake - small sample size?

davidsmith73
Graduate Poster, joined Jul 25, 2001, 1,697 messages
Perhaps quite an important point in light of Randi's criticism.

Randi says the sample size in Sheldrake's latest email telepathy experiment was small.

Only 50 people participated. True.

However, each person did about 10 trials, making the total number of trials 552. The exact binomial test was applied to this data set, which showed a 43% hit rate.

Is 552 trials a small sample size?
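For anyone who wants to check the arithmetic, here is a minimal Python sketch of an exact one-sided binomial test on the figures quoted above (552 trials, a roughly 43% hit rate, and a 25% chance level assuming four equally likely callers). It is an illustration of the test, not Sheldrake's own analysis; the exact number of hits is not stated in the post, so it is reconstructed from the 43% figure.

```python
from math import comb

n = 552                  # total trials quoted above
k = round(0.43 * n)      # ~43% hit rate -> roughly 237 correct guesses (reconstructed)
p0 = 0.25                # chance expectation assuming four equally likely callers

# One-sided exact binomial test: P(X >= k) when X ~ Binomial(n, p0)
p_value = sum(comb(n, i) * p0**i * (1 - p0)**(n - i) for i in range(k, n + 1))

print(f"{k}/{n} hits ({k / n:.1%}); exact one-sided p-value against 25%: {p_value:.2g}")
```

Taken at face value, 552 trials is ample for this comparison; the posts below argue that the real problems are the chance baseline and the controls, not the raw trial count.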
 

Is this the July '03 paper?
http://www.sheldrake.org/Articles&Papers/papers/telepathy/pdf/experiment_tests.pdf

IMO, the principal flaw with the experiment is not necessarily the statistics (which I didn't check) but that they didn't ask the people taking part to record their guesses prior to the call being made, and only asked what their guess was afterwards. The whole experiment can be dismissed as an example of individual confirmation bias.
 
In Randi's book "Flim-Flam!", he writes quite a bit about insufficient controls. Not recording the 'guess' until *after* the phone was picked up is a real flaw in the experiment. The guess should have been recorded before the phone was picked up. Like most people, I can 'predict' who's calling but I 'know' before I even pick up the phone. Better yet, the phone should not have been answered at all to maintain a 'blind' experiment. The guessers and test observers should not know the results of their guess until the experiment is over. Besides, if there really are psychic forces involved, the psychic message is sent by the caller *before* the phone is even dialed!

Also, there is the possibility that the guessers were allowed practice runs before the 'real' test began. If a practice run gives poor results, it is thrown out, but if the results are good, some experimenters include the run in their test data. Using practice runs in this manner greatly improves the results even if the rest of the experiment is perfectly valid.
 
I read through the 2003 paper and found something that didn't look right to me:

The experimenter (either R.S. or P.S.) telephoned the randomly selected callers in advance, usually an hour or two beforehand, and asked them to call at the time selected. We asked callers to think about the participant for about a minute before calling. We also rang the callers who had not been selected to tell them that they were not involved in this test session.

So the authors of the paper were the ones who informed the callers that they would participate and informed those who were not going to be part of the test. A proper test would include controls to prevent the callers from contacting the guessers before the assigned call times but that did not happen.
A few minutes after the tests, the experimenter rang the participant to ask what his or her guess had been, and in some cases also asked the callers. In no cases did callers and participants disagree.

Some cases? Why not all cases?

I noticed many of the tests were incomplete. The subjects (guessers) were said to have not completed the tests for whatever reason. How do we know this is true? Is it possible that the experimenters threw out bad data points?

In March 2001, we started a second series of experiments in which the participants were asked to name only two or three out of the four potential callers. We did this for two reasons. First, many people could not find four familiar people able to take part in the trials. Hence it was easier to recruit participants when they had to find only two or three familiar people.

Some tests used only two or three possible callers, but the analysis calculates a success rate as correct guesses out of total trials. When only two or three callers are possible, the chance of success by random guessing jumps to 33% or 50%! No mention is made of correcting the data by taking the number of possible callers into account.

Here's what the authors claim in their conclusion:
Combining the results of all our experiments, and adding in the trials conducted by Sam Bloomfield, there were 63 participants altogether. They made 231 correct guesses in 571 trials, a success rate of 40%, well above the mean chance expectation of 25%.

This is a completely invalid conclusion. Chance expectation was higher than 25% in the cases where there were only two or three possible callers. I looked through the numbers but I did not see which test subjects had fewer than four callers to choose from. Maybe believers in the paranormal like these researchers think this is an unimportant detail, but it is enough of a problem to prevent this paper from being published in a peer-reviewed journal.
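To make the objection concrete, here is a minimal Python sketch of how the chance baseline shifts once two- and three-caller sessions are pooled with four-caller ones. The split used below is purely hypothetical (the thread never gives the real breakdown); only the combined totals, 231 correct guesses in 571 trials, come from the quoted conclusion.

```python
# Hypothetical split of the 571 trials by number of possible callers.
# The real breakdown is not reported in the thread.
trials_by_callers = {2: 100, 3: 150, 4: 321}
assert sum(trials_by_callers.values()) == 571

# Expected hits under pure guessing: 1/2 per two-caller trial,
# 1/3 per three-caller trial, 1/4 per four-caller trial.
expected_hits = sum(n_trials / callers for callers, n_trials in trials_by_callers.items())
chance_rate = expected_hits / 571

print(f"Pooled chance expectation under this split: {chance_rate:.1%} (not 25%)")
print(f"Observed rate from the conclusion: {231 / 571:.1%}")
```

The fair comparison would be the observed 40% against whatever the true pooled chance rate is, not against a flat 25%.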
 
IMO, the principal flaw with the experiment is not necessarily the statistics (which I didn't check) but that they didn't ask the people taking part to record their guesses prior to the call being made, and only asked what their guess was afterwards.

Were they really that dumb?
 
Seriously? The way it was phrased suggested they did record it beforehand, but yeah, it wasn't stated outright.

In that case, in what way was this even a test at all?
 
In March 2001, we started a second series of experiments in which the participants were asked to name only two or three out of the four potential callers. We did this for two reasons. First, many people could not find four familiar people able to take part in the trials. Hence it was easier to recruit participants when they had to find only two or three familiar people.
Some tests used only two or three possible callers, but the analysis calculates a success rate as correct guesses out of total trials. When only two or three callers are possible, the chance of success by random guessing jumps to 33% or 50%! No mention is made of correcting the data by taking the number of possible callers into account.

I read this as the participants only specifying 2 or 3 of the 4 people who may call, not that only 2 or 3 people may call. This shouldn't change the odds.
 
I read this as the participants only specifying 2 or 3 of the 4 people who may call, not that only 2 or 3 people may call. This shouldn't change the odds.

I'm not sure how to read that, or how you are reading it. Do you mean that of the 4 possible callers, the participant guessed two or three names? That would improve the odds, wouldn't it?

Or do you mean that of 4 possible callers, the participants were allowed to specify only two or three that were allowed to call? That would be the same as only starting with two or three possible callers.

Or did you mean something else that I didn't understand?
 
Found this rebuttal by Sheldrake:

http://thetyee.ca/Views/2006/07/26/Sheldrake/

In this piece, published in July of this year, Rupert Sheldrake offers a rebuttal in The Tyee. Props to them for publishing it, and I thought it was worth putting up, since his study on "telephone-pathy" was apparently torn apart by Tyee writer Shannon Rupp. Her article is linked from Sheldrake's.
 
I'm not sure how to read that, or how you are reading it. Do you mean that of the 4 possible callers, the participant guessed two or three names? That would improve the odds, wouldn't it?

Or do you mean that of 4 possible callers, the participants were allowed to specify only two or three that were allowed to call? That would be the same as only starting with two or three possible callers.

Or did you mean something else that I didn't understand?

I think there are still 4 possible callers; the participant may specify who two or three of them are, but he still has to guess who it will be out of the four. The last caller may be someone he doesn't know, but he still has to guess if it is them calling.

Well, that's how I read it anyway.
 
Jekyll, you are reading that wrong. Here it is again:

First, many people could not find four familiar people able to take part in the trials. Hence it was easier to recruit participants when they had to find only two or three familiar people.

In the report, 'participants' are the people who make the guesses and 'callers' are the people who are chosen at random to call the participants. Callers are close friends or family members of the participants.

The researchers chose some participants who only knew two or three familiar people (callers). The odds for these participants would be 50% (two callers) and 33% (three callers).

Participants who had only two or three callers would have proportionally more correct guesses, which would raise the percentage of correct guesses above the 25% level. The study concludes by calculating the ratio of correct guesses to total guesses for all participants. This ratio is then compared to 25%, which would be valid only if *all* participants were choosing from four callers. But since some only chose from two or three, 25% is the wrong number to use for this test.
 
First, many people could not find four familiar people able to take part in the trials. Hence it was easier to recruit participants when they had to find only two or three familiar people.

In the report, 'participants' are the people who make the guesses and 'callers' are the people who are chosen at random to call the participants. Callers are close friends or family members of the participants.

The researchers chose some participants who only knew two or three familiar people (callers). The odds for these participants would be 50% (two callers) and 33% (three callers).

Yeh, I see what you're saying but...
In our second series, we asked participants to nominate a minimum of two callers, and we supplied the others, who were strangers to the participants.
Top of page 2, under the heading 'Callers'.

If I were being cynical, I'd say that they were intentionally running one big jumble of experiments so they could justify slicing the data any which way until they found a projection of it that appeared to give anomalous results, but the sloppy handling of the experiment made that unnecessary, so they mulched it all together.
 
I think there are still 4 possible callers; the participant may specify who two or three of them are, but he still has to guess who it will be out of the four. The last caller may be someone he doesn't know, but he still has to guess if it is them calling.

Well, that's how I read it anyway.

The only callers in the test were people familiar to the participants. The purpose of the test is to determine if people who are known to each other are linked by psychic forces. Including unfamiliar callers in the test would invalidate the results because this would completely change the nature of the test. Once again:

First, many people [participants] could not find four familiar people [callers] able to take part in the trials. Hence it was easier to recruit participants when they had to find only two or three familiar people [callers].
 
Doh! I missed the part where strangers were supplied by the researchers! Thanks for pointing that out. As my punishment, I'm leaving my embarrassing posts as-is instead of editing them.

But I will stand by my statement that including strangers invalidates the test. First, the study was to determine if people familiar with each other were linked by psychic forces. Even if you believe in psychic forces, using strangers as callers totally ruins the test. Also, assuming psychic forces do exist between familiar people, how am I supposed to know when a stranger is calling me? The phone rings, and since I am not receiving psychic messages, it must be the stranger? Finally, suppose I'm one of the unpopular participants and I have two friends and two strangers as my four callers. Again, I have a better than 25% chance of guessing correctly. I only have three possible responses: Friend A, Friend B, or Stranger. I can simply guess Stranger, and it will be correct in two of the four random possibilities.
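A minimal sketch of that last point, with hypothetical caller names: if two of the four equally likely callers are strangers and the allowed responses are 'Friend A', 'Friend B', or 'Stranger', then always answering 'Stranger' is right half the time.

```python
callers = ["Friend A", "Friend B", "Stranger 1", "Stranger 2"]  # hypothetical names

# Fixed strategy: always respond "Stranger"
hits = sum(name.startswith("Stranger") for name in callers)

print(f"'Stranger' is correct in {hits} of {len(callers)} equally likely cases "
      f"({hits / len(callers):.0%}), well above the 25% baseline")
```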
 
Doh! I missed the part where strangers were supplied by the researchers! Thanks for pointing that out. As my punishment, I'm leaving my embarrassing posts as-is instead of editing them.
Good for you. I really suck at admitting I'm wrong.

But I will stand by my statement that including strangers invalidates the test. First, the study was to determine if people familiar with each other were linked by psychic forces. Even if you believe in psychic forces, using strangers as callers totally ruins the test. Also, assuming psychic forces do exist between familiar people, how am I supposed to know when a stranger is calling me? The phone rings, and since I am not receiving psychic messages, it must be the stranger? Finally, suppose I'm one of the unpopular participants and I have two friends and two strangers as my four callers. Again, I have a better than 25% chance of guessing correctly. I only have three possible responses: Friend A, Friend B, or Stranger. I can simply guess Stranger, and it will be correct in two of the four random possibilities.

However, they might have named the strangers.
 
I don't know enough statistics to make authoritative comments. I have heard an expert statistician say that the sample size issue has to do with whether or not a study is actually "powered" to detect the effect the study is intended to detect...power being "statistical power", which term statisticians understand.

Beyond that I have no idea...the talk was all about hazard ratios and 95% confidence intervals and whether or not the people relying on the study knew a hazard ratio from a P value. Then the expert statistician put his briefcase down on a power strip and the projector died along with half the press-n-talk microphones and laptops. Funny thing is, the panel didn't seem to notice...
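Since "statistical power" came up, here is a minimal standard-library Python sketch of what a power calculation for this kind of design might look like. The assumptions are mine, not the study's: a one-sided exact binomial test at alpha = 0.05, a 25% chance rate under the null, and a true hit rate of 40% (the rate reported in the paper's conclusion) under the alternative.

```python
from math import comb

def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def power(n, p_null=0.25, p_true=0.40, alpha=0.05):
    # Smallest hit count that would be declared significant under the null...
    k_crit = next(k for k in range(n + 1) if binom_sf(k, n, p_null) <= alpha)
    # ...and the probability of reaching it if the true rate is p_true.
    return binom_sf(k_crit, n, p_true)

# How the power of the test grows with the number of trials
for n in (20, 50, 100, 571):
    print(f"n = {n:4d} trials: power ≈ {power(n):.2f}")
```

A study is "powered" for an effect when this number is high (0.8 or more is a common convention); if the reported effect size is taken at face value, hundreds of trials give plenty of power, so power per se is not the weak point here.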
 
Sample size determines confidence interval

I appreciate how difficult this topic is. I struggled to explain it every time I taught the concept.

Here's what I believe the study is claiming:
We are 95% confident the human population can correctly state who is calling, before picking up the phone, between 36% and 45% of the time.

They are stating the 95% confidence interval in the paper's abstract. The sample size and their assumptions about the variance in the human population's ability determine this interval. I believe that http://www.isixsigma.com/offsite.asp?A=Fr&Url=http://www.qualitydigest.com/may00/html/lastword.html provides a great discussion of the mathematics behind this effect.
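As a cross-check on the 36%-to-45% figure, here is a minimal Python sketch using the Wilson score interval for a binomial proportion, applied to the combined totals quoted earlier (231 correct guesses in 571 trials). The paper's own interval method is not stated in the thread, so treat this as an approximation rather than a reproduction of their calculation.

```python
from math import sqrt

def wilson_interval(k, n, z=1.96):     # z = 1.96 for a ~95% interval
    p = k / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

low, high = wilson_interval(231, 571)
print(f"Approximate 95% CI for the hit rate: {low:.1%} to {high:.1%}")
# Comes out close to the 36%-45% range described above.
```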

Now, let me comment on Randi's concern about sample size. I offer that, given the extraordinary claim, we should expect an extraordinarily high confidence. 95% is the lowest I would accept in even a mundane situation. I suggest that a 99.5% confidence is a reasonable requirement before we even begin to agree that an extraordinary ability exists. I offer that what Randi is saying, in statistical terms, is: "Until the researchers increase their sample size such that we know to 99.5% confidence that 25% is less than the human population's accuracy, we should dismiss the result out of hand."

As with any statistics, we give up accuracy whenever we attempt to make additional inferences. I worry that the researchers did not properly reduce their confidence as the degrees of freedom of their variables diminished while they inferred more and more from the sample. For example, their inference that callers from abroad were less likely to be known beforehand should have reduced their confidence in their primary hypothesis. (As you mine more from the data, you exhaust what the data can do.)

I choose here to ignore the concern about the experimental protocol as off-topic, all the while appreciating the wonderful insights Forum posters have.

I hope that helps,
Gulliver
 
