JayUtah
You're skewing the test results in your favor by limiting participants to four sentences/answers. On top of that, you're giving them the answers to choose from.
This is not inherently a problem. In the classic Zener card deck, for example, the participants have to choose from a small, well-defined set of drawn figures--five figures, five of each in a deck of 25. Later we can go down the rabbit hole of all the ways one could cheat at Zener guessing. But those aside, it's a valid test. The question is simply whether the guesses statistically fit an expected distribution. Having a finite set of possible outcomes allows us to create that model. Your suggestion does that too.
Having given the example of the Zener deck, I want to shift to a simpler protocol using a different variable. A typical Zener run does have a well-defined statistical basis, but it's a poor example for explaining the concept; it's not intuitive. So I'll pick one that is.
Consider instead a fair six-sided die. A single trial is a single fair throw of the die. The conductor throws the die and visualizes the result. Separately, out of sight, the participant tries to guess which number came up on the die. For a single trial, the null hypothesis says the participant should guess correctly one out of every six times. Or rather, that the naked probability of a correct guess is p = 1/6.
But to test properly in statistics, we need a distribution. So we define a run as 60 trials. The score for that run is the number of times the participant correctly guessed the die number. In theory the expected score is 10 if the null holds. But in practice, even under the null, the scores from a series of runs will form an approximately normal distribution around 10. Getting to that distribution is what's important.
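To make that concrete, here's a minimal sketch (Python is my choice here, not part of the protocol) that simulates many 60-trial runs under the null, i.e., a guesser with no information at all about the die:

Code:
    import random
    from collections import Counter

    TRIALS_PER_RUN = 60

    def run_score(trials=TRIALS_PER_RUN):
        # One run: count how often a blind guess matches a fair d6 throw.
        return sum(random.randint(1, 6) == random.randint(1, 6)
                   for _ in range(trials))

    scores = [run_score() for _ in range(10_000)]
    print(sorted(Counter(scores).items()))  # piles up around 60 * (1/6) = 10

The histogram of those scores is the binomial, near-normal curve the runs should follow if nothing but chance is operating.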
We could define a run as 6 trials (clustering around 1), or as 600 trials (clustering around 100). The former gives us too few distinct scores to work with. The distribution will cluster around 1, but since the only numbers in the vicinity are 0, 1, and 2, it will be hard to see whether the actual experimental distribution fits the curve or not. A run of 600 (clustered around 100) would let the data vary much more smoothly, but would probably be onerous for the participant. So let's say 60.
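That trade-off falls straight out of the binomial standard deviation, sd = sqrt(n·p·(1−p)). A quick sketch in the same hypothetical Python setting as above:

Code:
    from math import sqrt

    P = 1 / 6
    for n in (6, 60, 600):
        mean, sd = n * P, sqrt(n * P * (1 - P))
        # Roughly 4*sd + 1 integer scores fall within +/- 2 sd of the mean.
        print(f"n={n:4d}  mean={mean:6.1f}  sd={sd:5.2f}  "
              f"~{int(4 * sd) + 1} plausible scores")

At n = 6 nearly all the probability mass sits on a handful of scores; at n = 600 the curve is smooth, but the participant faces 600 throws per run.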
But we need more than one run. We might end up doing a large number of total trials, but they could be, say, 20 runs of 60 trials each. That gives us 20 data points for this participant. Then we try to fit the experimental data to a normal distribution with a mean of 10, the null-hypothesis curve for 60 trials at p = 1/6. If the fit yields p < 0.05, meaning the correctly parameterized null curve is unlikely to explain the experimental data, then we will have shown a statistically significant effect.¹
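One simple way to operationalize that, given the footnote's interest in a shift toward μ > 10, is a one-sided z-test of the run scores against the null mean. A sketch, assuming SciPy is available and using made-up run scores purely for illustration:

Code:
    from math import sqrt
    from scipy.stats import norm

    NULL_MEAN = 10.0                    # 60 trials * 1/6
    NULL_SD = sqrt(60 * (1/6) * (5/6))  # ~2.89 per run under the null

    def one_sided_test(run_scores):
        # z-test of the mean run score against the null, testing mu > 10.
        n = len(run_scores)
        mean = sum(run_scores) / n
        z = (mean - NULL_MEAN) / (NULL_SD / sqrt(n))
        return z, norm.sf(z)            # upper-tail p-value

    scores = [11, 9, 13, 10, 8, 12, 10, 9, 11, 14,
              10, 12, 9, 11, 10, 13, 8, 10, 12, 11]  # hypothetical data
    z, p = one_sided_test(scores)
    print(f"z = {z:.2f}, one-sided p = {p:.3f}")

A goodness-of-fit test (e.g., chi-square against the binomial) would be closer to the "fit the curve" wording; the mean-shift test above is just the simplest version consistent with footnote 1.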
We won't have proven ESP. We will simply have shown that there is an effect that we can then study further. Michel isn't even to that stage. With proper controls in place, and a defensible (if simplistic) statistical model, he can't show that there is an effect. Without an effect to explain, it's meaningless to think about any possible cause.
_______________________
¹ But in this case the effect we're interested in is a shift in the μ > 10 direction. A shift in the other direction would more likely indicate a problem in the protocol. This is the correct way to diagnose methodology problems: by looking at the results, not by navel-gazing at subjective "credibility."