Is the Telekinesis Real?

All right, let’s get into the fundamental errors behind Buddha’s treatment of the PEAR research. Or rather, behind his dismissals of PEAR’s critics and traitorous former collaborators. We’ll start with a basic example of statistical modeling and work our way out to the Jeffers papers in later posts. Don’t worry, the math isn’t scary and the concepts aren’t hard.

Statistics is the algebra of uncertainty. Even processes we trust to be stable and predictable produce outcomes that vary a little from trial to trial. Some of that variance depends on the nature of the process, but every process will vary simply because of the mathematical circumstances of how we measure it. Statistics provides us with the tools to reason rationally and defensibly in the face of this unavoidable uncertainty. Uncertainty in observation is most acutely felt in the empirical sciences, where we must assure ourselves that the part of the process we are intentionally varying (the independent variable) is really what’s causing a visible change in the thing we measure (the dependent variable), and that the change isn’t due to all the other independent variables that change willy-nilly and also affect the process (the confounding variables, or just “confounds”).

To treat the outcome of a process statistically, it must be quantifiable and reasonably predictable from theory. The latter implies that it is also reasonably stable, or possibly metastable. The phrase “predictable from theory” is fraught with detail, so we’ll hit it from a few different angles in some introductory posts before we tackle Stanley Jeffers’ single-slit and double-slit experiments in later posts. Along the way we’ll develop the concept of what it means for a process to be random yet predictable, and introduce the concept of a constitutive relationship to create random variables from practically any quantifiable behavior.

We start, as so many introductions to statistics do, with the simple coin toss.

The process is familiar: a flat round disc of negligible thickness is thrown into the air sufficiently high and with sufficiently exciting long- or middle-axis rotation as to make its landing position practically impossible to predict in real time. The prominent surfaces are embossed with distinct graphics so as to determine which side has landed face up, generally demarcated as “heads” or “tails.” The outcome of the process represents a binary variable (a categorical variable with two categories).

Okay, we can codify the outcome into a category simply by observing whether it has landed heads-up or tails-up. Most importantly, the regularized geometry of the object and its dynamic motion assure us intuitively that the coin has an equal probability of landing in either position given nominal starting conditions. We use coins in introductory examples because “predictable from theory” is straightforward here. There just isn’t a lot of theoretical complexity to a coin. But we have to be pedantic here to show that there is theory, because later examples require us to treat “predictable from theory” with greater rigor, and intuition is much less instructive in those cases.

But how is the outcome “reasonably predictable from theory” in a way that lets us treat it statistically for significance testing? First of all, it’s a binary variable, not a continuous variable. The outcome is either-or, not a continuous distribution of values. (That doesn’t rule out independence testing via the chi-square test, though. Nor would an ordinal categorical variable with many possible outcomes strictly rule out fitting collected data to a Poisson distribution, but we’ll get to that.)

Second and most important, you might object, it’s not at all predictable. We toss coins to obtain a random value to make binary decisions. The whole point is that the outcome of a coin toss is unpredictable. Well, yes, the outcome of one coin toss is both binary and unpredictable. But there’s a useful consequence of a process being trial-to-trial independent (like all truly random events must be) and having a determinate probability assigned to each possible outcome (i.e., in this case an equal chance of heads or tails). It’s that we can then expect -- and therefore predict -- certain behavior over the long term. That is, if a coin has equal chances of coming up heads or tails, then over many tosses we would expect the coin to come up heads about the same number of times as tails. This is known as the Law of Large Numbers, and it describes the behavior of a stochastic (i.e., random) process in the limit. N.B. this is a concept in probability theory, keenly studied in mathematical statistics, or the kind of statistics Buddha says is necessary to understand his criticism of PEAR’s critics. Be that as it may, looking at the counts of different kinds of occurrences is the bread in the bread-and-butter of frequentist statistical modeling. Buddha’s explanations ignore it entirely, even though the authors he criticizes explain their models carefully. In the double-slit paper, for example, the dependent variable is named in the first sentence of the abstract.
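
If you want to see the Law of Large Numbers in action without waiting around for the limit, a few lines of Python will do it. This is just a sketch: the trial counts and the seed are arbitrary, and nothing here comes from any of the papers under discussion.

```python
import random

# Simulate tosses of a fair coin and watch the heads fraction settle toward 0.5.
# The trial counts are illustrative only.
random.seed(42)  # fixed seed so the sketch is reproducible

for n in (10, 100, 1_000, 10_000, 100_000):
    heads = sum(random.random() < 0.5 for _ in range(n))
    print(f"N = {n:>7}: heads fraction = {heads / n:.4f}")
```

The individual tosses stay unpredictable; only the aggregate behavior settles down, which is the sense in which the process is “predictable from theory.”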

But of course the real world doesn’t work in the limit. It works in practical or happenstance numbers of trials, and confounding variables. Thus the normal distribution happens only in theory, and only for some kinds of processes. In practice, for small numbers of coin-tosses, you may observe more heads than tails or vice versa. In fact, for odd numbers of consecutive tosses, you must have more of one than the other. That’s what I said above about the mathematics associated with measurement necessitating some variance from expectation. Where N=1, the coin came up either heads or tails that one time, and that’s just an inescapable fact of how numbers work. The term “shot noise” refers to the set of phenomena that occur when the observation is governed by small numbers, not large numbers. We’ll come back to this concept in a future post.

Another way of saying “The number of heads should equal the number of tails” is “The number of heads minus the number of tails is zero,” or

B = Nheads - Ntails

Over a large number of tosses, B should hover around zero. Initially, on the first toss, B cannot be zero. Nor can B be 3. B must be either +1 or -1. On the second toss, however, B can be 0, +2, or -2. It cannot be +1 or -1. A more elaborate version of this principle is how Robert Jahn was able to recognize anomalies in his baseline data, and how Jeffers was able to point out its effect on the final results. As the number of tosses, N, increases, the number of values B can take on increases, because B accumulates the experience of prior tosses in a way that the individual trials cannot. But as we’ll see, the problem never goes away completely.

Your task now is this: every day for the next year, you are to toss a coin 10 times, compute B, and write it down.

At the end of the non-leap year we have a series of 365 values for B, one for each day’s run of N tosses, or trials. What does theory tell us we can expect those values to look like? The essential nature of the coin tells us that the most common value for B should be zero. It should come up an equal number of heads and tails most of the time. This is what we can “predict from theory” about B. The next most likely number should be +1 or -1, right? Well, sure, because we should expect at most only slight variance from theory, not gross variance, and that’s the smallest non-zero value.

Screeeeeeeetch! goes the record. In this case no, because if I toss the coin ten times, and B is zero, that means Nheads = Ntails, Nheads = N - Ntails, and N is 10. The only solution to that system is Nheads = Ntails = 5. The relationship Nheads = N - Ntails must still hold for all computations of B, but that’s just a consequence of the model we set up. So if Nheads = 6, Ntails must be 4. B cannot be +1 or -1, but it can be +2 or -2. Moreover, if we have an odd N, then B can never be zero because Nheads could never equal Ntails.

All right, all right. So in our model B has to be even. And |B| can’t be greater than N. Nheads = 0 and Ntails = 10 marks the extent of B. B can’t be +12 or -12. Or another way of saying that is the probability of |B| = 12, 13, 14,... is zero. Statisticians might write that as P(|B|>N) = 0. We see all kinds of similar notation in the standardized language of significance testing and other statistical pursuits. But -- in notation -- what can we say about P(B=0)? Above we agreed it should be the most common case. The next most common case is where only one toss was out of place, written |B|=2. P(|B|=2) isn’t as large as P(B=0), but it’s certainly greater than zero. P(|B|=10), i.e., all tosses came up either heads or tails, is not out of the question, but it definitely would be so remarkable that you’d tell people at work about it.

Now all the people who have studied statistics are shifting uncomfortably in their seats because in their minds they’re all saying, “JayUtah, you’re just belaboring the construction of a histogram. Get on with it!”

Well, yes. If you have a bar graph where the x-axis is the value of B, from -10 to +10, and the y-axis is the number of days your coin-tossing session produced that value of B, then your bar graph would look like a really pixelated, spaced-out bell-shaped curve. You’d have holes for the B-values that are odd numbers, of course, since it’s not possible for B to take on those values where N is even. But where B does have a value, the values are bell-shaped across that domain. That is, the B-values are bell-shapedly distributed across that year’s worth of daily trials. It is, in fact, a binomial distribution. And since our outcomes occur with equal probability, it will approach the normal distribution in the limit where N approaches infinity. For the statistics nerds, it’s N(np,np(1-p)), where p = 0.5, n→∞. N.B. Don’t be confused by other presentations that seem to take the opposite approach by saying the normal distribution can be used to approximate the binomial distribution. They’re really just saying the same thing from two different perspectives.
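
Here’s a minimal sketch of that year of record-keeping, simulated rather than performed. The seed and counts are arbitrary; the point is only that the odd B-values stay empty (the holes) while the even ones fill out a lumpy bell.

```python
import random
from collections import Counter

random.seed(1)
N, DAYS = 10, 365  # tosses per day, days of record-keeping

# B = (number of heads) - (number of tails) for each day's run of N tosses
b_values = []
for _ in range(DAYS):
    heads = sum(random.random() < 0.5 for _ in range(N))
    b_values.append(heads - (N - heads))

# Crude text histogram: odd values of B never occur, so their bars stay empty.
counts = Counter(b_values)
for b in range(-N, N + 1):
    print(f"B = {b:+3d} | {'#' * counts.get(b, 0)}")
```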

Wow, JayUtah! You’ve done it! You’ve managed to derive a properly-distributed value from a binary-valued process! You’re the greatest statistician ever!

Well, no. There are holes in the histogram. Also, unbeknownst to you, I told AleCcowaN to do the same thing for a year, only with 1000 tosses per day instead of only ten. (And really it wasn’t even knownst to him.) His B-values aren’t even in the same ballpark as yours because his N was higher and allowed for it. Being off by 5 over 1000 tosses isn’t the same magnitude as being off by 5 over 10 tosses.

So two problems to fix: make our B-value independent of N, and fill the holes in the histogram that will arise depending on whether N is even or odd. The latter problem -- bin sizes for histograms of continuously-valued data -- has a whole set of acceptable solutions to it, but maybe we’ll cover those later. For now we’ll just consider a bin, for histogrammatical purposes on the non-zero values, to be the sum of each adjacent even-odd pair of B-values, knowing that the odd member of every pair will always contribute a count of zero.

A moment’s thought presents the solution to the scale problem. We are really interested in the fraction of tosses that weren’t perfectly equal, not the absolute number. Nearly every practical observation has the normalization problem, but it’s almost always easily solved. We revise the definition of B as follows

B = ( Nheads - Ntails ) / N

At least this allows us to directly compare values obtained over the year from daily runs using different N. But there still arises the problem that your B-values can change only in increments of 0.1, if N=10. Alec’s can only change in increments of 0.001. His variance will look different than yours, for the identical underlying process. This is an inescapable consequence of N-values, which is why they must be chosen carefully. Again, a more sophisticated version of this phenomenon is how Jeffers knows the PEAR baselines are fishy.

N.B. The customary way to present the coin-toss frequency example is simply to count the number of heads, which leads to the mean hovering around N/2 and all the histogram bins filled. I’ve added the slight complication to illustrate the normalization step for different counts and to show how the concept of histogram bins helps us see where Jeffers was pointing with regard to Jahn’s baseline-bind problem.

Here are the descriptive statistics of a simulated coin-toss run performed once a day for a year with N=10

Mean|-0.007671232876712
Standard Error|0.016510804273099
Mode|0
Median|0
First Quartile|-0.2
Third Quartile|0.2
Variance|0.099501430076773
Standard Deviation|0.315438472727682
Kurtosis|-0.300957931860189
Skewness|-0.050546326867519
Range|1.6
Minimum|-0.8
Maximum|0.8
Sum|-2.8
Count|365

...and the same for N=1000.

Mean|-0.000997260273973
Standard Error|0.001765810490169
Mode|0
Median|0
First Quartile|-0.024
Third Quartile|0.022
Variance|0.001138101640825
Standard Deviation|0.033735762046009
Kurtosis|0.302641468096883
Skewness|0.004261861849087
Range|0.208
Minimum|-0.106
Maximum|0.102
Sum|-0.364
Count|365

If you know your way around the standard set of descriptive statistics, there are some eye-popping differences from what is supposedly the same underlying process. Those differences determine how conclusive we can be later on, and how useful as baselines these distributions can be. Look at the high-order moments, see how lopsided the coarse distribution is, and see how differently the mean and variance behave between the two runs. That’s a consequence only of the coarseness of experimentation and measurement. There’s nothing inherently wrong with the underlying physical processes.
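
If you want to reproduce tables like these yourself, something along the following lines will do it. It’s a sketch: your exact numbers will differ from mine because the random draws differ, and I’m assuming SciPy is available for the skewness and excess-kurtosis figures.

```python
import math
import random
import statistics
from scipy import stats  # assumed available, for skewness and excess kurtosis

def simulate_year(n_tosses, days=365, seed=0):
    """Return one year of daily normalized B = (heads - tails) / N."""
    rng = random.Random(seed)
    series = []
    for _ in range(days):
        heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
        series.append((heads - (n_tosses - heads)) / n_tosses)
    return series

def describe(series):
    q1, _, q3 = statistics.quantiles(series, n=4)
    return {
        "Mean": statistics.mean(series),
        "Standard Error": statistics.stdev(series) / math.sqrt(len(series)),
        "Median": statistics.median(series),
        "First Quartile": q1,
        "Third Quartile": q3,
        "Variance": statistics.variance(series),
        "Standard Deviation": statistics.stdev(series),
        "Kurtosis": stats.kurtosis(series),  # excess kurtosis, as in the tables above
        "Skewness": stats.skew(series),
        "Range": max(series) - min(series),
        "Count": len(series),
    }

for n in (10, 1000):
    print(f"--- N = {n} ---")
    for name, value in describe(simulate_year(n)).items():
        print(f"{name}|{value}")
```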

So what can we do with this?

Well, we have a normal-ish distribution for a value derived from a random binary process. Once you have that constitutive relationship for any process of any type, you can do anything with that dependent variable. Deriving the constitutive relationship is what is often hard. Easy in this case, easy in Jeffers’ analysis, but hard in others that I’ll hopefully get to.

What we can do in this thread, obviously enough, is try to affect the coin-toss psychokinetically. For the upcoming year you’re going to do the same coin-toss regimen as before (except we’ll up the number of tosses to N=1000), only each day you’re going to do it twice: the second time, think very hard about making some of those coin tosses come up tails when they should have -- all other things being equal -- come up heads. That’s our dependent variable -- that second series of B-values where you tried to force tails.

But what does that mean -- pun intended. The mean value of B over a year should be very close to zero. It will be zero only in the limit as M (the number of days you conduct trials, not the number of tosses each day) approaches infinity, but its absolute value should be tiny. But if you can preferentially make it come up either heads or tails using your mind, then what happens to the aggregation of all those B-values over a year? If we compute the descriptive statistics for the mentally-tainted B-values, we may discover that small negative B-values came up a little more often than the unaided daily random process would have given us. And that biases the mean B-value toward the negative.

Now, gee, if only there were some sort of test that lets us compare two empirically-obtained samples from a phenomenon that is Gaussian in the limit!

If I tweak the simulated coin toss so that once in every 20,000 tosses the coin comes up heads -- when by all other accounts it should have come up tails -- I can simulate the effect of a weak PK influence. We can then perform a t-test on the baseline B-values over a year against the tainted B-values over that same year. (While the t-test is allowable here, there are actually better significance tests for this particular example.) I get a one-tailed p-value of 0.159, which is unfortunately well above my alpha of 0.05. Apparently I’m not psychokinetic. How about 1 in 10,000 coin tosses, i.e., doubling my PK fu? Yep, my one-tailed p-value is 0.0416, or less than 0.05. Yay! I’m now psychokinetic!
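
For the curious, here’s roughly how such a simulation and test can be wired up. This is a sketch, not the exact code behind the p-values above: the way I model the “nudge” (occasionally forcing a would-be tails to heads) is my own assumption, the seeds are arbitrary, and I’m assuming a SciPy recent enough to support the alternative keyword on ttest_ind.

```python
import random
from scipy import stats

def year_of_b(n_tosses=1000, days=365, flip_rate=0.0, seed=0):
    """Daily normalized B-values; flip_rate is the chance a would-be tails is forced to heads."""
    rng = random.Random(seed)
    series = []
    for _ in range(days):
        heads = 0
        for _ in range(n_tosses):
            is_heads = rng.random() < 0.5
            if not is_heads and rng.random() < flip_rate:
                is_heads = True  # the simulated "PK nudge"
            heads += is_heads
        series.append((heads - (n_tosses - heads)) / n_tosses)
    return series

baseline = year_of_b(seed=1)
tainted = year_of_b(flip_rate=1 / 20000, seed=2)

# One-tailed t-test: is the mean B of the nudged series greater than the baseline's?
result = stats.ttest_ind(tainted, baseline, alternative="greater")
print(f"t = {result.statistic:.3f}, one-tailed p = {result.pvalue:.3f}")
```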

There you go. The underlying process produces binary values. But it’s a binary variable that follows a predictable pattern, so we can frame it in terms of quantified departure from that pattern, and that becomes something that approaches a normal distribution -- something we can talk about statistically. And in a broader sense, this is generally how we can approach any process that is predictable according to any conceivable pattern. If we can quantify the pattern from theory, measure departures from the pattern, and measure the frequency and amplitude of those departures, we have the basis for modeling the pattern using a random variable.
 
In the previous post we learned how to create a usefully-distributed random variable from a binary variable with a predictable long-term behavior. Let’s look at a slightly more complicated example of “predictable from theory.” This has nothing to do with psychokinesis, but bear with me.

A launch vehicle of any useful size generally has to be fed with electricity, topped off with propellants and other fluid expendables, and connected to ground computers until launch. This is accomplished universally by means of umbilicals that must be deadfaced and disconnected often within a very short time after the vehicle lifts off, but must stay connected reliably while it’s still on the pad. In rocketry’s early days this was almost trivial, as the lift of the rocket was more than sufficient to supply the necessary pull-off force, and umbilicals could be oriented with their connection axis vertical to facilitate this mechanically. As rockets grew larger, the overall area needed to accommodate all the connections made it necessary to move the interface to the side of the rocket. Higher pressure for delivered fluids and leakage limits meant the extraction forces for these umbilical interface assemblies reached upwards of 200 lbf (ca. 900 N) for the Saturn V’s S-IC main launch umbilical.

The functional requirements for these designs are unforgiving. Upon liftoff, the umbilical must disconnect, otherwise loss of vehicle, mission, and launch facilities (the trifecta of a Bad Day in aerospace) is a significant risk. Conversely, until liftoff the umbilical must not disconnect, else spillage of hazardous materials, electrical arcing, and other day-ruining events are a significant risk. Various mechanical methods of holding the umbilical interface securely in place have been employed, such as cam-and-groove couplings. Various methods of ensuring extraction have been tried, such as powerful springs and gas ejectors for active extraction. These mechanisms are often tripped by a lanyard that pulls taut after the rocket has risen a prescribed amount, as well as by control signal from the ground launch sequencer.

Most notably -- especially for human-rated spaceflight -- the performance of the umbilical interface in either role must be engineered to a specified level of reliability, generally expressed as the probability of success for any given trial. A probability of success p > 0.9999 is referred to informally in the industry as “four nines” of reliability.

Engineers, of course, don’t learn statistics or use them.

Let’s ignore the hold-on force for now. While there are some amusing stories from the field about the performance of cam-and-groove couplings, I want to get the points across before everyone’s eyes glaze over.

The pull-off force can be estimated from theory. The theory in this case is the observation that most elements of an umbilical interface are close-fitting cylindrical sleeves. Fluid transfer ducts work this way, as do electrical pins and sockets. Extraction is a matter of pulling a cylinder from its annular sleeve against the friction imposed by the inner surface of the sleeve and the outer surface of the pipe or pin. Elementary material science and basic geometry provides a basis for first-order estimation. The overall contact surface area for each sleeve can be easily computed from measurements obtained from the design documents. The coefficients of friction can be found in manuals provided by the material suppliers. Estimates of normal force to help reckon friction can be deduced from the fit tolerances, especially for the fluid connectors. Summing these up gives us the nominal pull-off force.

But this is engineering, so of course we’re going to measure it.

You can imagine the test apparatus, with the fluid lines all pressurized and a tensiometer attached to the extractor. At the signal, the services are deadfaced and the giant umbilical interface plate is yanked forcefully away by a spring-loaded ram. And all the test engineers crowd around the laptop to see what the peak or breakaway pull-off force was. Do you think they’re going to stop at the one test? No, there may be lives at stake. So they’ll spend the rest of the day reconnecting the umbilical, repressurizing and re-energizing the connections, and yanking it off as many times as budget allows for. At the end of the day they have a collection of varying pull-off force values, F, that should cluster around a mean and be roughly bell-shaped, to account for effects that their predictive model may not have addressed.

Bad news. The mean pull-off force is 600 N, and the gas-charge actuator that actually pulls the umbilical on the launch pad can pull with only 580 N, plus-or-minus 10 N at a 99% confidence interval. That means a number of launch attempts (determined by the standard deviation of the test data) will fail because the high-end tail of the test trial distribution falls outside the envelope of the extraction mechanism. Four nines of reliability means the envelope provided by the actuator has to almost completely encompass the entire distribution of measured values.
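
To make the tail argument concrete, here’s a hedged back-of-the-envelope in Python. The means come from the worked example above, but the standard deviation is a placeholder I invented purely to show the arithmetic.

```python
from scipy import stats

mean_force = 600.0       # N, mean of the test-stand pull-off measurements (from the example)
sd_force = 30.0          # N, hypothetical spread of those measurements -- a placeholder
actuator_force = 580.0   # N, nominal actuator capability (from the example)

# Fraction of the pull-off-force distribution that falls outside the actuator's envelope,
# assuming (for the sketch) that the measurements are roughly normally distributed.
p_fail = stats.norm.sf(actuator_force, loc=mean_force, scale=sd_force)
print(f"P(pull-off force exceeds actuator capability) ≈ {p_fail:.3f}")
print(f"Allowable for 'four nines': at most {1 - 0.9999:.4f}")
```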

(Here we have to step fully inside the realm of engineering and note that achieving four nines of reliability would almost certainly involve a giant margin on the gas-charge actuator, redundancy, and a backup extractor that works an entirely different way. This is an example of a certain kind of statistical reasoning, and so I’ve taken great liberties with the principles of engineering that I’m using to contrive the example.)

We can’t change the actuator in this case, because it’s been flight-qualified and is certified to its own required success rate. The force with which it operates may vary as we described, but the number of times it fires when signalled is the design driver here. It always fires, and we’ll just stipulate that a more powerful one wouldn’t go off as reliably when signalled.

And curiously, the estimation document gave a pull-off force of 540 N. Is it wrong? Should we be worried?

Maybe not. As we noted, it’s a first-order estimation. The constitutive relationship we envisioned for it was limited largely to a simplistic friction model based on elementary Euclidean geometry and the high-school level model of static friction. Good enough to get in the ballpark, we think. So if the measured force differs, that’s to be expected because second-order effects will be present in the actual trials. This is why the most useful statistical analysis is done against empirically determined baselines whenever possible.

We might write the dependent variable’s constitutive relationship estimate as

F = ∑ ( π d L T μ )
where d is the mating diameter of the fitting,
L is the length over which it is sleeved,
T is the “tightness” of the fit, as a normal force per unit area, and
μ is the coefficient of friction for the pair of materials making up the connection.

This is naturally summed for all connection elements using the specific values that pertain to each fitting.
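
A sketch of that summation, with entirely made-up fittings and material values standing in for real design data; they land near the worked example’s figure only because I chose them that way.

```python
import math

# First-order pull-off estimate: F = sum over fittings of (pi * d * L * T * mu).
# Every fitting and number below is a hypothetical placeholder, not real umbilical hardware.
fittings = [
    # (name, diameter d [m], sleeved length L [m], tightness T [N/m^2], friction mu)
    ("LOX fill duct",        0.10, 0.05, 60_000.0, 0.30),
    ("fuel fill duct",       0.08, 0.05, 60_000.0, 0.30),
    ("electrical connector", 0.03, 0.02, 15_000.0, 0.25),
]

total = 0.0
for name, d, L, T, mu in fittings:
    contact_area = math.pi * d * L   # lateral surface area of the sleeved contact
    f = contact_area * T * mu        # friction force contributed by this fitting
    total += f
    print(f"{name:>22}: {f:7.1f} N")

print(f"{'estimated F':>22}: {total:7.1f} N")
```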

If we wanted to explore this constitutive relationship a bit more, we could note such second-order effects as tightness, T, being a function of temperature and of the coefficients of thermal expansion. As parts of the fitting expand and contract, the force required to separate them changes. Every plumber knows this; if it doesn’t come loose, heat it up with the torch. T is also possibly directly affected by the pressure of the fluids it contains, if it is a fluid-bearing pipe. The higher the pressure in the fitting, the more tightly it will fit in its sleeve, even if it’s made of metal. And then the things that affect temperature -- whether the fluid is cryogenic, or whether the ambient temperature dominates. How much heat transfers through the umbilical interface structure itself to adjacent fittings? And that’s just the fluid ducts. What about the electrical connections? What if a high-current pin has shifted in its socket due to vibration and has arced and partially welded itself inside its socket? How do we even model that possibility?

This is why a dependent variable that’s “predictable from theory” is sometimes hard, and it’s also why “theory” in many cases comes from repeated observation under conditions that are as controlled as possible. It’s also a great way to start fistfights between frequentists and Bayesians.

An earlier draft of this post went into more detail about tightness being a function of several tertiary factors, and how each of those then becomes a parameter in the predictive model. But then my eyes started glazing over. Not because it’s intractable, but because the detail is ponderous and off-putting. The handwaving summary is that the constitutive relationship

F = { an unbelievable amount of interrelated effects }

then becomes a likelihood function. We learned all about those when trying to correct Jabba and his proof for immortality. It tells us that the expected pull-out force is a probability distribution -- maybe normally distributed or maybe not. But one thing it isn’t is a single-valued number. The first-order friction estimates are still buried down there, but surrounded by a bunch of random variables that we’re using to model the parameters we just discussed. Expected temperature range (ambient or incidental) goes in there, fluid pressure, likelihood of arcing and maximum weld tearout force. All these are not single-valued variables, but probability density functions themselves over different parameters. Each changes the shape of the resulting distribution for F. And the fact alone of F going from a single-valued estimate to a likelihood function in several parameters is what I’m trying to show in this post: “predictable from theory” can be where smart people earn big bucks, because it can be hard.
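
Here’s one way to picture that, as a toy Monte Carlo sketch. Every distribution and constant below is an assumption of mine for illustration, not data from any real umbilical program; the point is only that uncertain parameters turn a single estimate into a distribution for F.

```python
import random
import statistics

rng = random.Random(0)

def sample_pull_off_force():
    """One Monte Carlo draw of F under a toy parameterization of second-order effects.
    Every distribution below is a placeholder assumption."""
    base_friction = 540.0                     # N, the first-order friction estimate
    thermal_factor = rng.gauss(1.0, 0.05)     # thermal expansion/contraction scaling tightness
    pressure_factor = rng.gauss(1.05, 0.03)   # internal pressure tightening the fit
    arc_weld_extra = 80.0 if rng.random() < 0.01 else 0.0  # rare partially-welded pin
    return base_friction * thermal_factor * pressure_factor + arc_weld_extra

draws = [sample_pull_off_force() for _ in range(100_000)]
print(f"mean F       ≈ {statistics.mean(draws):.0f} N")
print(f"std dev      ≈ {statistics.stdev(draws):.0f} N")
print(f"P(F > 580 N) ≈ {sum(f > 580 for f in draws) / len(draws):.3f}")
```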

But while we’ve been fretting over likelihood functions, arced wires, and cryogenic feedlines, the design engineers have been quietly trying to solve the problem more practically. They’re going to anodize the contact surfaces on the big fluid connectors with a material that has a much lower coefficient of friction. With the new coefficients of friction in place and all the crap we contemplated a few paragraphs ago relegated back to the back burner of second-order effect, our friction-only model gives us the promise of reducing the pull-out force by 25% to 405 N. We should expect an empirically measured value to reduce by the same amount, if we’ve been right about what’s a first-order effect.

Back to the test stand we go, only to our chagrin our new mean pull-out force, over several trials, is 585 N, nowhere close to the improvement we expected from theory.

Statistically, what should be done now? Well, we could do a significance test to determine whether this new version of the dependent variable (pull-out force with anodized couplings) is a significant improvement over the old one. We treat the non-anodized-coupler data as the baseline and the anodized-coupler data as the variable and see what shakes out. I didn’t simulate any actual numbers for this, so I can’t provide as detailed an answer for this example. But informally it comes down to the standard deviations of the two series of tests, and how many tests were done in each series. (Remember how the descriptive statistics were so very different for different N-values in the coin-toss example?) From a business perspective, unless the reduction in mean pull-out force is statistically significant, there’s no point in recommending the change. It didn’t solve our main problem, and it would only complicate the manufacturing process.
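
Since no numbers were actually simulated for this case, the following is only a hypothetical sketch of how that comparison would be run. The force values are invented placeholders; the mechanics (a one-tailed Welch t-test of the two series) are the point.

```python
from scipy import stats

# Entirely hypothetical pull-off forces in newtons, made up to illustrate the mechanics.
baseline_runs = [612, 597, 604, 589, 601, 598, 607, 593]   # non-anodized couplings
anodized_runs = [588, 579, 591, 582, 586, 590, 577, 585]   # anodized couplings

# Welch's t-test (unequal variances), one-tailed: did anodizing reduce the mean force?
result = stats.ttest_ind(anodized_runs, baseline_runs,
                         equal_var=False, alternative="less")
print(f"t = {result.statistic:.2f}, one-tailed p = {result.pvalue:.4f}")
```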

Yeah, what about that main problem? Can statistics help us here? Very much so! What the statistical comparison of test data tells us is that our understanding of the mechanical behavior of the assembly here is deficient. That conceptual understanding is what we used to develop the constitutive relationship, and that’s supposed to quantify and predict the behavior of the actual machine. In this case, as opposed to the coin-toss, the process variable we’re interested in is directly and continuously quantified (force), but the model is missing something. Yes, we were able to affect one first-order factor in a way suggested by our concept, but there’s clearly another first-order effect.

Then in comes the old Apollo engineer.

“The fluid hoses on your test stand are too short,” he says. Puzzled expressions on the faces of the younger generation.

“I don’t understand,” one of them ventures. “Do you mean they’re somehow adding to the extraction force or something?”

“No, no, nothing like that,” says the grizzled veteran. “Here, watch the video.” He pulls up the old pad cameras of the Saturn V launches. “See how long those hoses are? See how they hang down?” The group nods, still not getting the point. “How much do you think those hoses weigh, full of liquid propellants?”

“Dunno, maybe a thousand kilograms? I’m sure we can estimate it or compute it.”

“I’m sure you can, but did you factor that into your model? Do the expectations for your dependent variable include it? Did you measure it in your tests as opposed to what it’s going to be like on the pad?” Sheepish silence follows. “In your tests, you thought you were changing the only first-order independent variable out there. It didn’t have as big an effect as you estimated because your estimate leaves out an important first-order independent variable, and that’s not something that changed when you anodized the fittings.”

“I’m still not sure what’s happening,” admits one of the youngsters.

“What happens when you have a cylindrical sleeve coupling with a horizontal axis,” explains the Apollo guy, “and you hang a weight from it such that there’s a force trying to misalign the coupling?”

“Oh, I know that. Misaligned sleeve couplings. They’re harder to extract.”

“Why?”

“Because the friction is concentrated at the upper and lower lips of the sleeve components, not evenly distributed around the contact area, for one thing. But more importantly, the normal force that determines friction is a function of the suspended weight as well as the tightness of the fitting. That weight could be orders of magnitude more than the fit tolerances or the internal pressure.”

“That’s right,” confirms the Apollo guy. “It was missing from your conceptual model, missing in your constitutive relationship, and therefore confounded in your test data.”

“Good thing we did that second test run, then,” says the manager, hovering in the conference room doorway.

“No, you blithering idiot!” growls the Apollo veteran. “You should have been ordering up these kinds of tests from the beginning.”

“To study what?” asks the manager in his defense. “We had a specific problem to solve, and a specific solution to try, and we devised a test to do that. It accidentally gave us some insight -- nothing more. We’re engineers on a mission, not curious scientists testing random hypotheses.”

“Wrong,” asserts the Apollo guy. “Every engineering design is a scientific hypothesis. It makes predictions about the behavior of the physical world that can be shown true or false by observation, in these cases by observing the exercise of the design. The model you formulated does more than just demonstrate the theoretical predictability of the outcomes of actual trials. It represents your knowledge of the problem you’re trying to solve. You should have been varying all those parameters and observing via test that the variations not only had an effect, but had the magnitude of effect your model predicted. You did that accidentally, but you should have been doing it on purpose, systematically, until your model paralleled the behavior of real-world testing. That’s how you get to four nines.”

This example shows how statistics is used in one of the fields Buddha told us didn’t need to delve very deeply, if at all, into it. It shows that expectations are themselves going to be distributions, not scalar numbers. It shows the connection between hypothesis testing and statistical modeling -- the two often come from the same set of assumptions and beliefs and can be tautological if not carefully reviewed. “Predictable from theory” can mean that if the predicted results don’t arise, the theory is wrong. But one has to put into place the proper controls to be able to draw that conclusion. We’ll examine that a little more in the next post that covers Jeffers’ single-slit apparatus.
 
Now let’s look at the two studies by Stanley Jeffers we’ve been talking about. First is the single-slit experiment. The behavior of light going through a single slit is reasonably straightforward. Due to a combination of physical effects, a beam of light shining through a narrow slit onto an observation plane will land in a diffraction pattern.

Traditionally the slit is oriented vertically and the band of light therefore appears vertical on the observation plane. Again traditionally, we cut through the band horizontally and measure the intensity of received light as a function of x-axis position across the observation plane, which I’ll hereafter call the sensor.

Buddha notes that the graph of a subset of the received intensity produces a bell-shaped curve, and all his misunderstanding of PEAR, John Palmer, and Stanley Jeffers stems from the leap he then makes. This, he insinuates, is the “distribution” that goes directly into the tests for significance.

It is not.

Well what is, then? We’ve done two previous examples of abstracting known, predictable, or derivable facts about process variables from various processes and formulating a constitutive relationship that gives us expected values as a random variable -- a quantity with a mean and expected variance. So that’s what we’re going to do now.

First, Buddha wrongly assumes the distribution of light across that horizontal slice is a Gaussian distribution. It can be, given certain physical care taken for the aperture, but ordinarily isn’t. But the distribution of light across the slice in general is governed not by Gaussian math, but by Fraunhofer math. Fraunhofer math, most notably, can produce diffraction patterns -- even from only one slit -- that exhibit the multinodal structure (i.e., multiple bands) we associate with the double-slit phenomenon. Buddha is sunk right there. He thinks the single-slit experiment is valid at least in principle because light going through a single slit produces a Gaussian distribution. In fact it doesn’t. So we must develop a constitutive relationship between the behavior of the apparatus and something we can treat as a tractable distribution.
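
For the record, here’s what Fraunhofer math predicts for a single slit, in a short sketch. The wavelength, slit width, and geometry are arbitrary illustrative values, not the parameters of Jeffers’ apparatus; note the subsidiary bright bands to either side of the central one.

```python
import numpy as np

# Fraunhofer single-slit intensity across an observation plane, using the normalized sinc.
# All parameters below are arbitrary illustrative choices.
wavelength = 633e-9      # m
slit_width = 50e-6       # m
screen_distance = 1.0    # m
x = np.linspace(-0.03, 0.03, 61)           # positions across the sensor, in metres

theta = np.arctan(x / screen_distance)
beta = np.pi * slit_width * np.sin(theta) / wavelength
intensity = np.sinc(beta / np.pi) ** 2     # np.sinc(u) = sin(pi*u)/(pi*u)

for xi, ii in zip(x, intensity):
    print(f"x = {xi:+.3f} m  I/I0 = {ii:.4f}  {'#' * int(40 * ii)}")
```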

As with the coin toss, that relationship in the single-slit experiment is the movement of some characteristic value over a number of independent trials. [Jeffers, S. and Sloan, J. “A low light level diffraction experiment for anomalies research.” Journal of Scientific Exploration. vol. 6 (1992) no. 4. pp. 333-352] This is the paper Buddha claims does not exist.

Jeffers computes the centroid of the distribution of light. The centroid is simply the x-value -- as in an (x,y) cartesian coordinate system -- of the place on the sensor where the light is brightest. It’s almost always the x-value of the highest point on the measured-intensity curve. It is very analogous to the mean in a normal distribution, in that Jeffers computes it as the first moment of the transformed (i.e., cubed) intensity data. But the actual math is entirely different. And, frankly, the physics. Regardless, it’s a very intuitive concept. Over a hundred runs or so, where the light is repeatedly interrupted, the sensor reset, and the light restarted, the computed centroid will be expected to be at roughly the same x-value every time. But of course it isn’t. Sometimes it’s slightly deflected to the right, sometimes slightly to the left. But not a long way, of course. The greater the magnitude of deflection, the less likely the apparatus is to result in it. Except, as we see later, the unaffected apparatus did display a long-term drift over several hours of operation. This is part of "predictable from theory." Theory here includes various uncontrollable variables such as thermal cycling that will affect any sensitive apparatus over a long term.
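
A sketch of what a first-moment centroid computation over a sampled intensity profile can look like. The cubing follows the description above; the paper’s exact recipe may differ, and the toy profile is invented.

```python
import numpy as np

def centroid(x, intensity, power=3):
    """First moment of a power-transformed intensity profile. The cubing mirrors the
    description above; the paper's exact formula may differ."""
    w = np.clip(np.asarray(intensity, dtype=float), 0.0, None) ** power
    return float(np.sum(np.asarray(x) * w) / np.sum(w))

# Toy profile: a peak centered near x = 0 with a little sensor noise, over 101 pixels.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 101)
profile = np.exp(-(x / 0.2) ** 2) + rng.normal(0, 0.01, x.size)

print(f"computed centroid ≈ {centroid(x, profile):+.4f}")
```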

The distribution of x-values of the position of the centroid over many trials is the dependent variable in the single-slit experiment.

So whither psychokinesis?

Robert Jahn theorized (you read that right, this was really Jahn’s experiment and not Jeffers’) that psychokinetic subjects could will the quantum-mechanical phenomena underlying the diffraction effect to cause the centroids to shift left or right over a large number of trials, over and above any dispersion that would occur from the dalliance of the machine -- i.e., the uncontrollable independent variables.

Had his hypothesis proven correct, he should have seen the x-values for the computed positions of the centroids display a mean significantly different from that of the baseline runs. The distribution of experimental centroids should have been significantly displaced horizontally from the distribution of baseline centroids. Unfortunately for him, there was no significant shift in where the centroids landed.

So why all the crap about rocket umbilicals? Because it’s a more practical example of the kinds of work Jeffers had to do -- also reported in the paper -- to assure his readers that the result he was expecting really was “predictable from theory.” Just like the novice engineers who didn’t anticipate all the important variables, Jeffers and PEAR were entering new territory and didn’t have the luxury of decades of prior work. Much of their initial work was attempting to validate their models in terms of how much effect other known or suspected variables could have. And most of Jeffers’ paper on the single-slit experiment was about describing and validating the apparatus, not investigating PK effects on it.

The concept of the mean centroid position distributed bell-shapedly over numerous runs is the part Buddha was missing, and it’s really important. That was the dependent variable, not the intensity curve itself. It’s a huge mistake in understanding the paper.

But since we’re here, let’s skim over the rest of the paper. If you skip to the end, you read that the 20 subjects Jeffers used for this experiment were unable to get the average centroid position to move significantly. But the significance testing here is pretty wild, if only in the empirical controls Jeffers contrived to eliminate potential confounds.

One thing Jeffers looked at was auto-correlation, which is just a fancy word for naturally cyclical behavior in the dependent variable. You can think of it like a new card deck where all the cards are in order. Patterns of values and suits will show up when the cards are dealt, and some guy might get all the aces simply because of his position at the table. The concern in science is that cycles in the data might correlate to cycles in active and inactive periods of the protocol and give a false positive. Figures 6(a) and (b) [Ibid., pp. 342, 343] show an acceptable auto-correlation plot, formed by correlating the data with a version of itself offset by an increasing number of trials (“lag”).
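
For the curious, that is all an autocorrelation estimate amounts to -- correlating a series with lagged copies of itself. The simulated white-noise series below is purely for show; a trial-to-trial independent series should give correlations near zero at every non-zero lag.

```python
import numpy as np

def autocorrelation(series, max_lag=20):
    """Sample autocorrelation of a series against lagged copies of itself."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return [np.dot(x[:-lag], x[lag:]) / denom if lag else 1.0
            for lag in range(max_lag + 1)]

rng = np.random.default_rng(3)
series = rng.normal(size=500)   # stand-in data: independent draws, no cycles
for lag, r in enumerate(autocorrelation(series, max_lag=5)):
    print(f"lag {lag}: r = {r:+.3f}")
```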

But most fascinating is the Rube-Goldberg procedure whose description ends on page 347 with the understatement that it removes the effect of short-term natural drift in the calibration data (see Fig. 3, Ibid., p. 338). A careful reading of the whole protocol description reveals the strategy for implementing empirical controls during the actual sessions. The subject is not allowed to choose which direction to bias the centroid -- the problematic “volition” variable from Jahn’s work. The baseline trials are interspersed with experimental trials. The same protocol is used for all runs.

This is actually the more difficult of the two Jeffers papers to read, first because it begins with a lengthy survey of theoretical and conceptual issues for PK and quantum mechanics, and a survey of the prior efforts. And second because of the laborious description of the methods used to validate this apparatus as a proper REG for PK research. The double-slit paper is more profound in its import, but thankfully expressed more briefly.
 
This thread is such a Dumpster fire that we’re avoiding the interesting questions. The reason Jeffers (yes, and Jahn too) went on to the double-slit version of the experiment is because the double-slit phenomenon -- seen with particles -- is truly one of the WTF? moments in physics. Many PK researchers hope to plumb the depths of earlier speculation about a connection between quantum effects and consciousness, speculation that was advanced to explain the poorly-understood observation. They honestly believe there could be a connection that could have scientific merit at the level expected by the mainstream. In sum, there’s a legitimate mystique to the double-slit experiment -- paranormal factors aside -- but I’ll leave it to the reader to research it on his own. I recommend Feynman’s famous lecture.

The most visibly notable thing about the double-slit effect is the prominently multi-nodal interference pattern. It resembles generalized Fraunhofer diffraction, but is not the same thing. This is a true interference pattern of the kind that arises from superposition of waveforms. The “nodes” are the dark parts and the “antinodes” are the light parts. The central antinode, of course, is brightest. The peak intensities of the subordinate antinodes fall off (reasonably symmetrically) with angular distance from the projection axis.

Buddha is immediately stuck because, while he’s able to make certain cases of Fraunhofer diffraction look bell-shaped enough for you to believe it’s a well-distributed dependent variable on its own, the interference pattern is clearly not -- and clearly cannot be -- one of those. His arguments ignore the basic indirection and abstraction that’s required of all statistical modeling -- and that’s beginner’s stuff.

By now we know the drill: find a constitutive relationship. What quantified properties of that interference pattern might we be able to measure, might exhibit variance over many trials that can be normalized, might be susceptible to low-level psychokinetic effects? Maybe significantly change the symmetry among the subordinate antinodes? Measure the centroids of the left and right flanking antinodes and see if a subject can make them differ significantly? Shift the whole interference pattern to the left or right, as was suggested with the single-slit version? Many possibilities here -- all fraught with their own problems in measurement, numerical behavior, and hypothesized susceptibility to PK influence.

Here’s what Jeffers decided upon. [Ibison, M. and Jeffers, S. “A double-slit diffraction experiment to investigate claims of consciousness-related anomalies.” Journal of Scientific Exploration. vol. 12 (1998), no. 4. pp. 543-550] He ignored the position of the centroids on the sensor and concentrated on relative intensity. In his model, the value he felt would vary appropriately is the first-order contrast [Ibid, p. 545]. That’s the difference in light intensity between the center antinode and the corresponding centroids of the nodes (troughs) to either side. That way the experiment could be left to let the light deposition wander all over the sensor without affecting the results. (I.e., he found a way that controlled for as many known variables as possible.) Only the normalized intensity of the brightest and darkest spots mattered, not where they were or what their absolute values were. He averaged the intensities of the nodes -- the two dark spots to either side of the central antinode -- and applied other empirical controls, which he describes on page 545. Each trial was a brief exposure of the sensor to light coming through the double slit. Given the drift he observed in the single-slit experiment and the headache it gave him, this is a marked improvement.
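
Here’s a hedged sketch of what a “contrast between the central antinode and its flanking nodes” computation could look like. It follows the prose description above, not necessarily the exact expression on p. 545, and the toy interference pattern is invented.

```python
import numpy as np

def _walk_to_node(intensity, start, step):
    """From the central peak, walk one sample at a time until the intensity stops falling."""
    i = start
    while 0 <= i + step < len(intensity) and intensity[i + step] <= intensity[i]:
        i += step
    return intensity[i]

def first_order_contrast(intensity):
    """Central-antinode intensity versus the mean of the two flanking nodes, normalized
    so that absolute brightness cancels out. A sketch, not the paper's exact formula."""
    intensity = np.asarray(intensity, dtype=float)
    peak_idx = int(np.argmax(intensity))
    peak = intensity[peak_idx]
    node_mean = 0.5 * (_walk_to_node(intensity, peak_idx, -1) +
                       _walk_to_node(intensity, peak_idx, +1))
    return (peak - node_mean) / (peak + node_mean)

# Toy double-slit pattern: cosine-squared fringes under a single-slit envelope.
x = np.linspace(-0.02, 0.02, 2001)
pattern = np.sinc(x / 0.01) ** 2 * np.cos(np.pi * x / 0.002) ** 2
print(f"first-order contrast ≈ {first_order_contrast(pattern):.3f}")
```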

The protocol was for each subject, at each session, to do a series of runs composed of trials that interleaved “inactive” (i.e., control) exposures with “active” (i.e., experimental) exposures. To provide feedback to the subject, the computed contrast was shown for each active trial as a bar of variable length. The subjects were allowed to affect the pattern, and see the results of their efforts, only during the active trials.

This experiment was done under Jeffers’ supervision at his university, and also at Princeton, using the same apparatus, under Jahn’s supervision. Jeffers’ subjects were told to “imagine that during the active runs they could identify (by extra-sensory means) the path of the light beam near to the double slit” [Ibid., p. 545]. This, they were told, would result in less contrast between light and dark, which they could gauge roughly using the indicator. A session consisted of a series of 41 runs of 21 trials. Jahn’s subjects were told “that their primary task was to intend the analogue indicator bar to remain as low as possible.” This led to discussion among reviewers whether directing the desired outcome versus directing the prescribed method of achieving those ends could be a significant variable. Jahn’s series were composed of only 20 runs each. (Remember the effects we observed as we varied N in the coin-toss example? These numbers matter!) Calibration runs were performed before and after each session at both sites, and the empirically-determined variance from those was used to parameterize the subsequent significance tests.

Naturally, as we’ve seen, he got ordinary variance in contrast across the inactive-data runs, meaning the contrast varied predictably around a mean value. This formed a bell-shaped curve that could be approximated by an empirically parameterized normal distribution. Then he compared the means and variances of the active-data runs with those of the inactive-data runs and found no significant difference [Ibid., p 547]. Again, the means and variances used were mean contrast and variance in contrast, not means and variances somehow determined for the raw shape of the interference pattern.

Seen in this perspective, the Jeffers double-slit experiment is hardly any different from our other examples, or even very remarkable on its own. Stanley Jeffers’ papers on psychokinesis are easy reads, published in a journal with modest standards. I would expect a prodigious high-schooler to be able to figure out what the statistical model was, what was actually being studied. Buddha’s error in reading them is egregious and irredeemable. It doesn’t stop at merely being unable to read and understand a scientific paper in an unfamiliar field. It extends to not understanding -- at a fundamental level -- how scientific analysis is done in general. He’s not even to the point of being able to state the problem correctly. Since his own purported expertise was the basis he used for judging the critics of PEAR, there’s no reason to suppose that judgment is based on a competent foundation of knowledge, and therefore it should be rejected.

In the first part of this series I promised we’d come back to the notion of shot noise. This thread started with Buddha trying to explain how Robert Jahn’s experiment worked and how PEAR used it to test psychokinetic ability. He was correct in saying that at the core of Jahn’s random-event generator (REG) is an electronic-noise circuit. And he’s correct in suggesting that such circuits work according to the principle that electricity is really a stream of discrete charge-carrying quanta, electrons. And he’s correct that a probabilistic process governs how many of those discrete electrons pass a fixed point in a fixed interval of time in the current through a circuit. As we explored in the coin-toss example, that number is guaranteed to vary from the mean, and can be measured finely. Commercial products exist that translate this phenomenon into useful signals such as voltage fluctuation.

The resulting graph of output from these circuits over time is not itself any kind of bell-shaped curve or classic distribution, obviously. It’s simply a chaotic-looking signal. If you sample it at regular intervals and take the mean value, that can be considered the mean of a distribution. The height of the peaks in the signal above the mean, and the depth of the valleys below the mean, and the frequencies with which these each occur are what is governed by probability. Very high peaks and very low valleys -- corresponding to probabilistic bursts and dearths of discrete electrons over a tiny interval -- are rare. They represent what the left and right tails of the distribution would suggest happens. In contrast, slight peak-or-valley excursions from the mean are common. They represent Z-scores much closer to the mean.

That’s the role the Poisson distribution plays in Jahn’s REG. It’s not the phenomenon that’s directly used in Jahn’s statistical treatment of his results. It’s not even the quantity in the signal his equipment measures. That’s like comparing apples to brake pads. We note that Buddha got the dependent variable wrong in the PEAR experiments too. He’s been wrong about every statistical model in this whole thread, in fact.

The measured voltage at any instant will be either above the mean or below it. The amount of time it remains in either state will vary at random around a mean value. In terms of electronics, it’s reasonably easy to detect when a voltage increases beyond a set value and decreases below it, which allows us to derive a reasonably discrete signal that’s simply “above” or “below.” And the length of those high-low pulses will vary randomly, but we can clip that and just consider a certain height to be the definitive crossing of the mean. Now gate (or sample) that signal with a short pulse at regular frequency, say 10 kHz. Each time your regular pulse fires, look to see whether your clipped signal is high or low (or positive or negative, if you’re thinking of voltage relative to the mean). That becomes a binary variable that occurs at the regular interval of your sample rate.

In other words, a coin toss.

Now over a broader interval -- say 10 Hz -- the count of positive versus negative bits from the REG should be roughly equal. Each 1/10-second window gives you 1,000 samples, and the difference between the positive and negative counts should theoretically be zero. But we know from practice that it won’t be. And just like the coin toss, the aggregate bias across a series of those 1/10-second windows adopts a bell-shaped curve centered on zero -- here, another binomial distribution. The underlying quantum behavior is Poissonesque in nature, but the analysis model we derived via a constitutive relationship is not; it follows the binomial distribution.
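
A toy sketch of that whole pipeline, using Gaussian noise as a stand-in for the real shot-noise source and following the 10 kHz / 10 Hz framing above. Nothing here reproduces PEAR’s actual hardware or sample rates.

```python
import numpy as np

rng = np.random.default_rng(7)

samples_per_window = 1000            # 10 kHz sampling gated into 0.1-second windows
n_windows = 365

# Stand-in for the noise voltage; real REGs derive this from shot noise in a junction.
voltage = rng.normal(size=samples_per_window * n_windows)
bits = voltage > voltage.mean()      # the "above/below the mean" comparator

b_values = []
for w in range(n_windows):
    window = bits[w * samples_per_window:(w + 1) * samples_per_window]
    positives = int(window.sum())
    b_values.append((positives - (samples_per_window - positives)) / samples_per_window)

print(f"mean B ≈ {np.mean(b_values):+.5f}, std dev ≈ {np.std(b_values, ddof=1):.5f}")
```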

That’s the dependent variable in the PEAR experiment. And it’s all explained in the article Buddha cited at the beginning of this thread. With pictures. [Jahn, Robert. “The persistent paradox of psychic phenomena: an engineering perspective.” Proceedings of the IEEE. vol. 70, no. 2 (February 1982). pp. 136-170] Buddha’s demonstrated unfamiliarity with scientific writing and basic statistical modeling extends to his own sources too. His analysis of Jahn and his dismissals of the critics of Jahn’s research are based on a concept of statistical modeling that is not even close to the way it’s actually done, or the way it was described in the source material. His claim to be expert enough that his judgment should be considered probative is therefore strongly rejected.
 
JayUtah said:
Be that as it may, looking at the counts of different kinds of occurrences is the bread in the bread-and-butter of frequentist statistical modeling. Buddha’s explanations ignore it entirely, even though the authors he criticizes explain their models carefully. In the double-slit paper, for example, the dependent variable is named in the first sentence of the abstract.

And that variable was mentioned many times by at least three of us, though generally in a concealed way so that a certain user would have an opportunity to show his proficiency and right his wrongs.

For instance, the last time I did it was in post #711

[I'll come back later with more comments on Jay's "primer" :D]
 
Jeffers wrote an article about his more recent experiment designed to prove that an object cannot affect the results of an empirical study without physically interacting with the equipment. Apparently, the Princeton research was one of his multiple targets; he wanted to prove that all claims that humans can mentally affect the results of an experiment are false. This time he used single-slit diffraction for his research, which is a correct choice; he should have used this type of diffraction a long time ago.

Jeffers, A Low Light Level Diffraction Experiment for Anomalies Research.

https://www.scientificexploration.org/docs/6/jse_06_4_jeffers.pdf

Jeffers explains the purpose of his experiment in the Introduction.

“A few experimental physicists have attempted empirical investigations in this area (Hall, Kim, McElroy and Shimony 1977). Radin and Nelson (1989) have attempted a meta-analysis of hundreds of experiments in this area published in a wide variety of journals and have concluded "it is difficult to avoid the conclusion that under certain circumstances, consciousness interacts with random physical systems".

Jahn (1986) and others have suggested that the claimed effect exists at the level of "information", i.e. it is the statistical distribution of possible outcomes from the apparatus that is affected by the operators intentions and thus the claimed effect is not seen as purely physical. In a similar vein, Eccles (1986) has suggested that intention may influence neural events in the brain by analogy with the probability fields of quantum mechanics. According to this view (the weak violation hypothesis (Schmidt and Pantas 1972)), there is no violation of conservation laws of physics. The claims advanced have been based on studies using a variety of experimental techniques. If true, then any process governed by probabilistic laws should be amenable to demonstrating the claimed effect. We have devised a simple optical experiment based on the phenomenon of single slit diffraction to examine these claims. This experiment comes closer to the alleged links with quantum mechanics. Our experiment yields high data-rates in computer compatible form and has been designed to meet various methodological criticisms.

The essential claim which has been advanced is that some human operators can produce a statistically significant shift of the mean of a given distribution generally in accord with intention. The other moments of the statistical distribution remain unaffected. In our experiment, the relevant distribution is the digitally recorded single slit diffraction pattern. This is recorded with high accuracy with a short integration time. The centroid of the pattern is determined. Many repeated measures allow for the study of their statistical distribution.” Jeffers

By choosing such a broad target, Jeffers proved nothing. Instead he should have chosen the specific research programs he criticized and proved on an individual basis that the research in question cannot be reproduced in his lab, as he tried before when he concentrated his effort solely on Jahn’s research. The conditions of Jeffers’ experiment do not comply with the conditions of any other experiment that he chose to criticize, so his research is completely useless.

Rather than analyzing the conditions of all the experiments done by his scientific adversaries, I will use the conditions of Jahn’s experiment to show that Jeffers did not reproduce them correctly.

“This experiment has been performed for 20 human subjects chosen in the following manner. An advertisement was placed on bulletin boards on campus inviting participation in the experiment. When respondents called, they were told that no physical effort would be required of them along with the relevant information as to when, where, etc. There was no screening of applicants. All scheduling was conducted on a first-come, first-served basis. When respondents arrived for their appointed session, they were told the purpose of the experiment, then asked if they were comfortable with the situation or had any questions before they began.” Jeffers.

The selection procedure for Jeffers’ experiment is nonrandom, while the one for Jahn’s experiment is random, which means that the randomness conditions essential to Jahn’s experiment were not duplicated in Jeffers’ experiment.

“This parameter does not show significant auto-correlation. A histogram of the sums over residuals for the bins we designated as left or right together with their auto-correlation functions is shown in Figure 7 and Figure 8. We note that these measures are uncorrelated and approximately Gaussian distributed.” Jeffers

“Approximately Gaussian distributed” is not good enough; it should be Gaussian distributed, as it was in Jahn’s experiment, because Jeffers based his analysis of the experimental data on the assumption that his process is a normally distributed random walk.

“A random walk having a step size that varies according to a normal distribution is used as a model for real-world time series data such as financial markets. The Black–Scholes formula for modeling option prices, for example, uses a Gaussian random walk as an underlying assumption.
Here, the step size is the inverse cumulative normal distribution Φ⁻¹(z, μ, σ), where 0 ≤ z ≤ 1 is a uniformly distributed random number, and μ and σ are the mean and standard deviations of the normal distribution, respectively”.

https://en.wikipedia.org/wiki/Random_walk
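For readers who want to see what a Gaussian random walk actually is, here is a minimal sketch in Python (NumPy and SciPy assumed; the step parameters and the seed are arbitrary illustrative choices, not values taken from any of the papers):

[code]
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)

mu, sigma = 0.0, 1.0   # mean and standard deviation of each step (illustrative)
n_steps = 1000

# Each step is drawn from a normal distribution; the walk is the running sum of the steps.
steps = rng.normal(mu, sigma, size=n_steps)
walk = np.cumsum(steps)

# Equivalent construction via the inverse cumulative normal distribution, as in the
# Wikipedia passage: draw z uniformly in (0, 1) and push it through the quantile function.
z = rng.uniform(0.0, 1.0, size=n_steps)
walk_via_quantile = np.cumsum(norm.ppf(z, loc=mu, scale=sigma))

print(walk[-1], walk_via_quantile[-1])   # two realizations of the same kind of process
[/code]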

Of course, Jeffers could have easily converted his approximately Gaussian distribution to a truly Gaussian distribution if he had used a power transform (I briefly mentioned power transforms in my previous posts), but he did not do so, which means that his analysis of the data is incorrect and should not be trusted.
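For reference, this is roughly what applying a power transform looks like in practice; a minimal sketch using SciPy's Box-Cox transform on made-up, right-skewed data (the data and parameters are purely illustrative, and nothing here settles whether such a transform would be appropriate for Jeffers' residuals):

[code]
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Made-up right-skewed data; Box-Cox requires strictly positive values.
x = rng.lognormal(mean=0.0, sigma=0.75, size=500)

# boxcox estimates the exponent lambda that makes the data most nearly Gaussian.
x_transformed, lam = stats.boxcox(x)

print("skewness before:", stats.skew(x))
print("skewness after: ", stats.skew(x_transformed))
print("fitted lambda:  ", lam)
[/code]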

“The bins are then grouped in threes, with the first bin designated as "no effort" and the second and third bins designated as "left effort" and "right effort" with the order of "left effort" then "right effort" or vice versa chosen randomly. We then analyze the bins in groups of four, made up from one group of three bins as outlined above along with the first bin of the next group of three, considered as a "no effort" bin. The first and fourth bin are "calibration" bins and the second and third are "data" bins.” Jeffers.

This is an incorrect calibration procedure because the subjects were told the purpose of the experiment before the calibration, if I understood the procedure correctly (Jeffers didn’t say exactly what the subjects were told about the experiment before it started). It is incorrect because the knowledge affects the subject’s mental state and introduces a bias.

There are other inconsistencies in Jeffers’ setup, I might discuss them later, but now I do not have time for that.
 
Statistic          | N = 10 tosses/run   | N = 1000 tosses/run
Mean               | -0.007671232876712  | -0.000997260273973
Standard Error     |  0.016510804273099  |  0.001765810490169
First Quartile     | -0.2                | -0.024
Third Quartile     |  0.2                |  0.022
Variance           |  0.099501430076773  |  0.001138101640825
Standard Deviation |  0.315438472727682  |  0.033735762046009
Kurtosis           | -0.300957931860189  |  0.302641468096883
Skewness           | -0.050546326867519  |  0.004261861849087
If you know your way around the standard set of descriptive statistics, there are some eye-popping differences for what is supposedly the same underlying process. Those differences determine how conclusive we can be later on, and how useful these distributions can be as baselines. Look at the higher-order moments and see how lopsided the coarse distribution is, and how differently the mean and the variance behave. That's a consequence only of the coarseness of experimentation and measurement. There's nothing inherently wrong with the underlying physical processes.
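For anyone who wants to reproduce numbers of this kind, here is a minimal sketch of one way to simulate them (NumPy and SciPy assumed; coding heads as +1 and tails as -1 and averaging each run is my assumption about the normalization, and the seed is arbitrary, so the values will not match the table exactly):

[code]
import numpy as np
from scipy import stats

rng = np.random.default_rng(2024)

def yearly_run_stats(tosses_per_run, days=365):
    # Heads = +1, tails = -1 (assumed coding); one normalized value per day.
    runs = rng.choice([-1, 1], size=(days, tosses_per_run))
    daily_means = runs.mean(axis=1)
    return {
        "Mean": daily_means.mean(),
        "Standard Error": stats.sem(daily_means),
        "First Quartile": np.quantile(daily_means, 0.25),
        "Third Quartile": np.quantile(daily_means, 0.75),
        "Variance": daily_means.var(ddof=1),
        "Standard Deviation": daily_means.std(ddof=1),
        "Kurtosis": stats.kurtosis(daily_means),
        "Skewness": stats.skew(daily_means),
    }

for n in (10, 1000):
    print(n, yearly_run_stats(n))
[/code]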

Some notes for those who are not familiar with numbers and some concepts.

First, the concept of "moment" in mathematics has nothing to do with time. It's a family of measures that take "shards" of "matter" (for instance, the smallest specks of an area or a volume) and make "calculations" to reflect the way they are "dispersed" around a point, a line or a plane.

A good, understandable example of it is an ice skater or a ballet dancer who spins with their arms wide open and a leg extended, and whose spinning speed suddenly increases when they pull their arms toward their body. The dancer has the same mass in both cases, but at the start that mass is distributed as far as possible from the axis of rotation (the arms and one leg contribute to a high moment because those masses are far from the axis). In this case, it's a second order moment because distances are raised to the second power (the power is the order of the moment). When the dancer pulls their arms and leg close to the axis of rotation, the spinning speeds up because this second order moment, called the "moment of inertia", decreases as part of the dancer's mass comes closer to the axis. As angular momentum must be conserved, the consequence is an increase in the rotational speed. [Our claimant surely knows a formula suggested by Wood that uses another moment of inertia to calculate shear stress]

Moments are that important. In the distribution Jay was talking about, variance is the second order moment with respect to the mean (it's the central second moment). It gives an idea of how dispersed the distribution is, though it's easier to understand once the standard deviation is derived from it. The standard deviation is like asking at what single distance from the mean you could stack all the "bars in your histogram" and still get the same moment you get when everything is dispersed around. Skewness is a third order moment (deviations raised to the third power) and kurtosis is a fourth order moment (you guessed it: raised to the fourth power). Why they matter: skewness measures the asymmetry of the distribution. In Jay's example, the distribution is a bit skewed to the left; in "my" example it is much less skewed, but this time to the right. That explains the "lopsided" nature of the coarse distribution. It looks a little bit like Igor from Young Frankenstein after his reconstructive surgery.
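To make the vocabulary concrete, here is a minimal sketch of how those moments are computed from a sample (NumPy and SciPy assumed; the data are made up and merely stand in for a year of normalized daily runs):

[code]
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.normal(size=365)   # made-up data

mean = sample.mean()
variance = np.mean((sample - mean) ** 2)                   # second central moment
std = np.sqrt(variance)
skewness = np.mean((sample - mean) ** 3) / std ** 3        # standardized third moment
kurtosis = np.mean((sample - mean) ** 4) / std ** 4 - 3    # "excess" kurtosis; 0 for a Gaussian

# SciPy computes the same quantities directly:
print(skewness, stats.skew(sample))
print(kurtosis, stats.kurtosis(sample))
[/code]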

Now, about the numbers and the normalization that Jay described:

Back to those perfect, idealized coins that have a fifty-fifty probability in a single toss: with the 10 tosses a day, the table really says that through the year you got a mean of about 4.96 heads per day (5 plus -0.0077 x 5), but the daily values were awfully dispersed, with 5 heads the most repeated value, lots of instances of 3, 4, 6 and 7 heads, and even the occasional 2 or 9.

In "my" case, the mean through the year was about 499 heads, and the distribution is not only looking more symmetrical but less disperse. Half of the results are between 476 and 522 heads (500 plus -.024 * 1000 and 500 plus .022 * 1000), and most of the rest are pretty close. Getting exactly 400 heads is a really rare event, not as much as winning the Loto but still most unlikely. Even getting 450 heads is difficult: it happens once in a blue moon (1/6000 probability).

This all goes to the size of the sample (365), the size of a run (10 or 1000), and which distribution is apt to estimate probabilities and thus define a confidence interval. Keep this in mind. (I haven't read the next posts from Jay yet; they are like fudge, to be savoured.)
 
This time he used single-slit diffraction for his research...

And do you recall that you recently claimed he did no such thing? The paper you are now attempting to review is something you vehemently claimed did not exist, even after several of your critics attested to have read and understood it. Why is it that you have not conceded your error? This is not the first time you have claimed something about Stanley Jeffers that turned out not to be true. Do you have an explanation for this pattern of denial?

As I predicted, you have ignored the presentation of the errors in your reasoning and have instead embarked upon an exposition of the research I reported on as if you are the one bringing it to the forum's attention. I have noted this as a recurring aspect of your presentation -- that you seem to co-opt the efforts of others and claim them as your own. <snip>


By choosing such a broad target, Jeffers proved nothing. Instead he should have chosen the specific research programs...

Irrelevant. You are not an expert in the field Jeffers published in. Your opinion of what he should have done instead carries no evidentiary weight.

The conditions of Jeffers’ experiment do not comply with the conditions of any other experiment that he chose to criticize, so his research is completely useless.

Non sequitur. If someone criticizes previous work for reasons he explains, it stands to reason he should take a different approach in his own research. As I have demonstrated, you are not an expert in this kind of research or the statistics that empowers it. Your opinion of its worth carries no evidentiary weight.

The selection procedure for Jeffers’ experiment is nonrandom, while the one for Jahn’s experiment is random

No, there are no facts from the paper that support this claim.

...which means that the randomness conditions essential to Jahn’s experiment were not duplicated in Jeffers’ experiment.

As I have demonstrated, you are not an expert in randomness. Your interpretation carries no evidentiary weight. There are two material differences between Jahn's protocol and Jeffers'. The first is the aforementioned difference in volition between the subject pools. The authors simply note that they hypothesized some effect that might occur as a result, but that effect was not studied. You are effectively claiming not only that there was an effect, but that the effect invalidated the Jeffers protocol. This is entirely an invention on your part. As noted in the papers themselves, Jeffers and PEAR collaborated on both the single- and double-slit research. You continue to err in your assumption that Jeffers set out to disprove PEAR, although all the facts with which you have been presented say otherwise.

The second difference is the fewer calibration runs in the Jahn protocol. This does not work in Jahn's favor.

“Approximately Gaussian distributed” is not good enough; it should be Gaussian distributed, as it was in Jahn’s experiment...

This is pure hogwash.

As I demonstrated in Part 1 of my tutorial, the fewer subjects one chooses, the less faithfully one's empirical data approaches the Gaussian distribution. The PEAR version of the research had a smaller N-value than Jeffers' version. Perhaps you would like to tell the nice folks here how any data set with a finite N-value can be a Gaussian distribution without at least some error. You would do well to read the validation sections of the papers you have cited to discover how their authors dealt with this fact.

There is no measured process that is truly Gaussian. As I explained, processes are Gaussian only in the limit. As such, the processes can sometimes be approximated by properly parameterized Gaussians, as I did with the binomial distribution of the coin toss. But the Gaussian is a theoretical ideal, not a practical standard. You continue to display a misunderstanding of descriptive statistics at the fundamental, conceptual level.
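To illustrate the point about limits, here is a minimal sketch comparing the exact binomial distribution of head counts with the Gaussian that shares its mean and variance, at a few run sizes (SciPy assumed; the run sizes are illustrative and simply echo the earlier coin examples):

[code]
import numpy as np
from scipy.stats import binom, norm

def max_approx_error(n, p=0.5):
    """Largest absolute difference between the binomial pmf and the matching
    Gaussian density, evaluated at the integer outcomes 0..n."""
    k = np.arange(n + 1)
    exact = binom.pmf(k, n, p)
    approx = norm.pdf(k, loc=n * p, scale=np.sqrt(n * p * (1 - p)))
    return np.max(np.abs(exact - approx))

# The approximation error shrinks as n grows, but it never reaches zero.
for n in (10, 100, 1000):
    print(n, max_approx_error(n))
[/code]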

Of course, Jeffers could have easily converted his approximately Gaussian distribution to truly Gaussian distribution if he had used a power transform...

No, Buddha, your one-shot Googled-up side-note is not the magic bullet you keep trying to claim it is. You brought up this transform only because you thought it was what your critics needed to bridge the gap between your broken concept of the PK study and the stated results. Instead we showed that your concept was wholly wrong. The power transform literally has no relevance here.

This is an incorrect calibration procedure because the subjects were told the purpose of the experiment before the calibration, if I understood the procedure correctly (Jeffers didn’t say exactly what the subjects were told about the experiment before it started). It is incorrect because the knowledge affects the subject’s mental state and introduces a bias.

No. This entire paragraph is a non sequitur. You are simply calling out two different concepts -- the calibration regime and the qualitative difference in instruction -- and hoping that your audience will buy your attempt to connect them in a way that casts a vague aspersion. The calibration procedure was an attempt to correct short-term drift. As you may notice in the 10-hour baseline run, there is long-term drift in the form of the visible mean having a half-dozen or so inflection points. There is also short-term drift in the form of the graph being "fuzzy." None of that has the slightest to do with any bias possibly introduced by dictating means versus outcome. Both control data sets were well correlated.
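For readers following along, this is roughly what correcting short-term drift with bracketing calibration bins looks like; a minimal sketch under my own simplifying assumptions (a straight-line baseline interpolated between the two calibration bins, which is not necessarily the exact scheme used in the paper):

[code]
import numpy as np

rng = np.random.default_rng(7)

# Made-up centroid readings for one group of four bins:
# bins 0 and 3 are "calibration" (no effort), bins 1 and 2 are "data" (effort).
drift = np.linspace(0.0, 0.5, 4)                   # slow instrumental drift across the group
readings = drift + rng.normal(scale=0.1, size=4)   # plus measurement noise

# Estimate the local baseline by interpolating between the two calibration bins,
# then express the data bins as residuals from that baseline.
baseline = np.interp([1, 2], [0, 3], [readings[0], readings[3]])
residuals = readings[1:3] - baseline

print(residuals)   # what would be accumulated over many groups and tested for a mean shift
[/code]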

There are other inconsistencies in Jeffers’ setup, I might discuss them later, but now I do not have time for that.

I am not interested in your excuses. Nor am I interested in your minor nit-picks of Jeffers in an apparent attempt to distract from the grievous errors I have identified in your previous work.

When you were told that Stanley Jeffers performed experiments using a single-slit diffraction apparatus, you insisted that no such work was done, and that no report of any such work existed. Do you acknowledge that you erred in your claim?

My primary purpose in reviewing the three papers -- Jahn's IEEE paper and the two Jeffers papers -- was to demonstrate that you did not understand what the dependent variable was in those studies. This is a serious failure on your part. It speaks to your fundamental ability to read, understand, and knowledgeably review the work you are discussing. You did not address this in any way. Do you concede that you were incorrect in determining the dependent variables in the Jahn IEEE paper, and in the two Jeffers papers that deal with diffraction? Do you further concede that you lack the skill to do so? Do you agree that arguments predicated on your ability to knowledgeably review that work should be rejected as lacking foundation?

Edited by Loss Leader: 
Edited. Moderated thread.
 
Moments are that important.

They are, in fact, the values that determine to what extent your sample distribution may be approximated by the normal distribution without unacceptable error.

In Jay's example, the distribution is a bit skewed to the left; in "my" example it is much less skewed, but this time to the right. That explains the "lopsided" nature of the coarse distribution.

When data values can only vary by certain increments, this is reflected in the various moments. If we follow up on your image of mathematical moments as their geometric analogues, coarsely spaced data simply cannot occur just anywhere with respect to some basis. They can only occur on the "grid" of available values, like the colors and pixels in 8-bit graphics. This is why we have the Poisson distribution in addition to the Gaussian.
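As a quick illustration of that last point, here is a minimal sketch contrasting a Poisson-distributed count, which lives only on the integer "grid," with the Gaussian that matches its mean and variance (SciPy assumed; the event rate is an arbitrary illustrative value):

[code]
import numpy as np
from scipy.stats import poisson, norm

lam = 3.0              # arbitrary illustrative event rate
k = np.arange(0, 11)   # counts can only be whole numbers

exact = poisson.pmf(k, lam)                           # probability of each integer count
gaussian = norm.pdf(k, loc=lam, scale=np.sqrt(lam))   # Gaussian with the same mean and variance

for count, p_exact, p_gauss in zip(k, exact, gaussian):
    print(count, round(p_exact, 4), round(p_gauss, 4))
# At a low rate the Poisson is visibly lopsided (skewed right); the Gaussian is not.
[/code]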

Getting exactly 400 heads is a really rare event, rarer even than winning the Loto.

Which is why the tightly correlated baseline data was so suspicious in Jahn's original study. I've illustrated the mathematical principles that dictate why data taken according to a certain sampling procedure needs to vary by at least a certain amount. Buddha hasn't addressed this at all.
 
Buddha said:
This time he used single-slit diffraction for his research, which is a correct choice; he should have used this type of diffraction a long time ago.

Jeffers, A Low Level Diffraction Experiment for Anomalies Research.

https://www.scientificexploration.or..._4_jeffers.pdf

Well, well, well, you finally found what I mentioned in post #769 and you doubted it existed in post #846! :)

But I see from your post that you not only disregard completely the variables in a problem (I'll come back to that later) but also the dates.

This paper of Jeffers' was published back in 1992, six years before the paper with the double-slit experiment, which was an improvement for reasons that are escaping your understanding.
 
...as he tried before when he concentrated his effort solely on Jahn’s research.

Just to clarify, what do you mean by "before" here? Before the single-slit experiment there was only Jeffers' commentary in Skeptical Inquirer on PEAR's initial efforts, from which we get the analysis of the baseline problem. This began Jeffers' collaboration with PEAR. Robert Jahn actually suggested the single-slit experiment as a way to tap into the hypothesized PK phenomenon more directly at the quantum level. As aleCcowaN notes, this was followed up several years later by the double-slit experiment. You seem to have the timelines reversed, but it seems fair to give you the opportunity to clarify your position. Contrary to your claim that the double-slit experiment was a patently ill-conceived attempt intended by Jeffers to discredit PEAR, you'll note that it was actually co-authored by a PEAR researcher and participated in by PEAR subjects.

This poses two problems for me. First, I can't tell which aspect of Jahn's research you're referring to in your response -- his initial work, or his collaboration with Jeffers. Please clarify this so that I can revise my remarks as needed. Second, the facts simply don't support your characterization of Stanley Jeffers as the mustache-twirling villain in the drama you're trying to conjure up. He is a well-qualified physicist who approached the topic of psychokinesis seriously. He reached out to, and was embraced by, both ideological sides of the issue. The people you accuse him of trying to sabotage instead happily collaborated with him, and he sought out his would-be critics (Alcock) to secure their assistance and approval.

The latter point is why it is important that you acknowledge your error in claiming Jeffers did not experiment with single-slit diffraction. You will recall you also claimed that Jeffers did no original PK research at all, a falsehood you begrudgingly conceded when the evidence from your own sources contradicted it. Without a similar concession here, you would be establishing a pattern of cherry-picking the relevant literature to appear to support the preconceived notion you have of Jeffers as a biased researcher. That preconception of bias has colored your analysis from the beginning -- you started the discussion by claiming critics of PEAR were incompetent and biased before you even knew who they were. Are you paying attention only to the facts that reinforce your preconception? If so, why should the audience trust your analysis?

The selection procedure for Jeffers’ experiment is nonrandom, while the one for Jahn’s experiment is random, which means that the randomness conditions essential to Jahn’s experiment were not duplicated in Jeffers’ experiment.

No, this is the customary method of obtaining and briefing test subjects for academic psychology research. In Jeffers' case they were clearly sampled from the population and were not screened. You present no facts from Jahn to support the contention that his subjects were somehow any better a subject pool or better "randomized." Further, we discussed the issues of screening and homogenization when you tried to draw the parallel to medical testing. You did not display a correct understanding then of how subject pools are controlled. In fact, you deployed the rather ludicrous statement that data cannot be excluded from the study for any reason once the subject pool is established without violating some vague concept of "randomness." You ignored evidence from the literature to the contrary, considerable discussion on the difference between your proffered example and other cases, and you have ignored my later demonstration of a straightforward normalization step that preserves statistical integrity. Further, in your thread on reincarnation you admitted you were not familiar with the concept of empirical control in general. You are not an expert in how subjects are selected and prepared for research, or how data are treated statistically to accommodate loss. Your judgment here lacks foundation and has no evidentiary value.
 
I am going to reply to several posts at once. I have explained the difference between the single-slit and double-slit diffraction patterns and provided links to the relevant articles. Frankly, I have nothing more to say about these types of diffraction. To me the topic is closed. I will let the audience judge who is right and who is wrong. If someone doesn’t understand my posts, they should learn more about the single-slit and double-slit experiments. I just want to remind the audience that I provided the links to the printed data supporting ALL my assertions, while my opponents did not submit a single article or a book in favor of their treatment of the subject.

One of my opponents wrote that I ignore the articles written by scientists criticizing Jahn’s research. This is not true. I chose Palmer’s review, which is highly critical of the Princeton research, for my response to the critics. It appears that Palmer wrote the most comprehensive review critical of all forms of ESP research, so I chose it for my response to the critics of two ESP phenomena (metal bending and human telekinetic influence on the measuring apparatus). There are other articles as well, and I will cover some of them in the near future. Obviously, I cannot analyze all articles written by the critics of ESP research. I’ll note in passing that all other ESP phenomena, except for teleportation, are of no interest to me.

One of my opponents wrote extensively about the topics that have no relevance to my posts and to Palmer’s, Wood’s and Jeffers’ articles as well. I didn’t bother to read his posts in their entirety, and I am not going to waste my time responding to them.

The issue of the drawings presented in Wood’s article came up again. This is my response to it.

“Theoretical interpretation of tests leads to severe difficulties. The first example concerns the principal strains allegedly produced by Stephen North on circular discs, where it is stated ‘for ... a single radial stress vector we would expect corresponding signals [principal strains] to be approximately equal and of opposite sign’. Such a push-pull system is known only for pure shearing action, and the authors clearly did not mean that. There is a well known solution for forces P, acting on the diameter D of a disc of thickness t, giving principal stresses at the centre of 2P/πtD and -6P/πtD. If Hasted and Robertson meant that then they were in error by a factor of 3. In fact nothing at all can be said unless complete stress fields are clearly specified, implying that investigators of the paranormal should beware of plunging into the field of stress analysis.” Wood.

I am not an expert on structural analysis, but my friend Jeff is; he has a PhD in Structural Engineering. Out of curiosity I showed Jeff the above quotation and asked him whether he thinks that Hasted made an error by a factor of 3 or any other factor. He said that, based on the material presented, it is impossible to say whether Hasted made an error or not. To see whether the error was indeed made, Jeff would have to read Hasted’s article or a chapter from his book and check his calculations. I do not have Hasted’s book, so this matter remains unresolved. However, I would like to remind my opponent that Wood didn’t say that this purported error invalidates Hasted’s interpretation of his experiment.

“Also worthy of mention at this point are brief comments by an electronics expert named Horowitz (cited by Randi, 1982), who maintained, apparently rather indignantly, that the signals which appeared on the chart recorder in Hasted's experiment are readily explicable as electrical transients picked up by the amplifiers.” Palmer, page 187

Randi, Flim-Flam. This is a book by Randi. I do not have the book, so I will respond to Palmer’s interpretation of Horowitz’s notes.

The amplifiers can amplify the noise because their job is to amplify anything. But after a signal mixed with noise is amplified, the electronic filters filter the noise out because their job is to filter. If there is a signal, it appears at the filter’s output, but if there is only noise, the output is below the required level, and whatever comes out counts as no-signal. Well, this shows that Horowitz is no expert but an electronics technician at most.
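To show concretely what is meant by filtering, here is a minimal sketch of a low-pass Butterworth filter attenuating broadband noise around a slow signal (SciPy assumed; the sample rate, cutoff, and noise level are arbitrary illustrative values, and nothing here describes what Hasted's electronics actually did):

[code]
import numpy as np
from scipy import signal

fs = 1000.0                        # illustrative sample rate, Hz
t = np.arange(0, 2.0, 1.0 / fs)

slow_signal = np.sin(2 * np.pi * 2.0 * t)                        # a 2 Hz "signal"
noise = np.random.default_rng(3).normal(scale=0.5, size=t.size)  # broadband noise

# 4th-order low-pass Butterworth with a 10 Hz cutoff, applied forward and backward
# so it introduces no phase shift.
b, a = signal.butter(4, 10.0, btype="low", fs=fs)
filtered = signal.filtfilt(b, a, slow_signal + noise)

print(np.std(noise), np.std(filtered - slow_signal))   # out-of-band noise is strongly attenuated
[/code]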

This is a general remark – Palmer chose a correct strategy by trying to prove, albeit unsuccessfully, that Jahn’s experiments were conducted and interpreted incorrectly. I have shown that Palmer is no expert on statistics or engineering; he should not have ventured out of his field, which is psychology. I hope that he sees my posts and finds the courage to respond to them.

Jeffers chose a wrong and illogical strategy by trying to show in his second article that telekinesis doesn’t exist.

Suppose you run a series of flawless experiments on a group of subjects and show that the group did not exhibit telekinetic abilities. Does this mean that no one on earth has telekinetic abilities? Of course not – your experiments simply show that your subjects are not telekinetic. To prove that telekinesis doesn’t exist, you would have to test the entire population of the earth. Good luck with that!

Jeffers’ claims of impossibility of having telekinetic abilities show his total lack of understanding of inductive logic. It seems strange that a seasoned experimenter would make a childish mistake of such magnitude, but Jeffers managed to do just that.
 
But I see from your post that you not only disregard completely the variables in a problem (I'll come back to that later)...

Indeed, I would press Buddha to answer these questions as simply as he can.

1. The dependent variable in Robert Jahn's IEEE invited paper is ________________.

2. The dependent variable in Stanley Jeffers' single-slit experiment is _______________.

3. The dependent variable in Jeffers' double-slit experiment is ___________________.

I would submit that incorrect or missing answers to these questions cast doubt on his ability to read and correctly understand the research he is reviewing. Naturally if one cannot demonstrate a correct understanding of the pertinent facts, then one's subsequent analysis of the subject cannot be considered well-informed and does not therefore deserve much weight as evidence.
 
I have explained the difference between the single-slit and double-slit diffraction patterns and provided links to the relevant articles.

The question is not the difference between single- and double-slit diffraction patterns. The question is why you think the single-slit diffraction pattern is a normal statistical distribution. Because of this error, you wrongly think Jeffers is trying to treat the double-slit diffraction pattern as a normal statistical distribution. He is not, and you are egregiously wrong. Again, this is not a redeemable error. It is not too harsh to say you have no clue what the statistics in any of those papers is actually about.

Frankly, I have nothing more to say about these types of diffraction.

You're not being asked about diffraction. You're being asked about the dependent variables in the papers we have discussed. You cannot demonstrate the ability even to state the problems correctly that were outlined in the papers we're discussing. This casts significant doubt on your ability to analyze them in any meaningful way. Your clearly uninformed opinion regarding their validity certainly cannot be considered to have evidentiary weight.

To me the topic is closed.

To me it is not, as you still have not demonstrated minimal competence in reading and understanding the material you propose to criticize. This leaves the foundation of your arguments regarding PEAR entirely in doubt, as it was based on a profession of expertise. These questions will not go away simply because you want them to.

...while my opponents did not submit a single article or a book in favor of their treatment of the subject.

False. You were directed to Zimbardo's book containing his retrospective of the Stanford prison experiment. You did not consult it. You were referred to Zimbardo's paper, which you initially claimed to have read and then recanted. You were given a reference to the excision of outlying data in medical trials. You did not address it. You were given references to material you claimed did not exist, and you did not correct your error.

Most notably, you were given references to papers written by someone you prejudicially dismissed as biased and ignorant, and you were asked questions about it. You have singularly failed to answer those questions. You were redirected to material you yourself referred to, in the form of Jahn's IEEE article. You were shown in detail how you misrepresented it, and you have elected not to respond.

You have confused the ability to provide footnotes with the ability to read and understand the reports of psychology research, and the ability to discuss actual examples of statistical modeling in the sciences.

One of my opponents wrote extensively about the topics that have no relevance to my posts...

Mere gaslighting. The posts I wrote directly and comprehensively describe the errors you have made in attempting to interpret Jahn, Palmer, and Jeffers. The first one provides a foundation of understanding in the science of statistical modeling, which you profess as a data analyst but cannot seem to demonstrate when reviewing others' work. It is meant to instruct the other readers in basic methods of constituting dependent variables so that they can judge for themselves whether you have done it correctly. The second directly refutes a claim you made where you named several disciplines and insinuated that they needed no great knowledge of statistical methods. The third and fourth directly address your claims as to what in the Jeffers studies was actually being measured and treated statistically. They describe the actual dependent variables and how they were derived. This can hardly be considered irrelevant because you have spent considerable effort trying to show that those studies were statistically invalid for exactly those reasons.

I didn’t bother to read his posts in their entirety, and I am not going to waste my time responding to them.

Your unwillingness to face your documented errors does not make them go away. Nor will this be the last they are spoken of. The tenor of your entire post today seems to be merely doubling down on your insinuation of superior knowledge over your critics here, and over authors who published professionally on these topics for years. And on that basis alone you keep suggesting that they are "somehow" still wrong. That attitude is inconsistent with your unwillingness to meet challenges to your understanding.

The issue of the drawings presented in Wood’s article came up again.

Be that as it may, the issue of your misunderstanding of the PEAR research is not a settled matter. Therefore please do not keep trying to change the subject. Since you have stated that you have limited time, I recommend we not divide our attention. The PEAR issue is still open so long as you do not correctly identify the dependent variables in the studies and own up to your prior errors. That is a basic thing to understand about a research report before attempting to criticize it.

I have shown that Palmer is no expert on statistics...

You've done no such thing. You've pretended to do so by comparing his knowledge to yours, but in fact the record shows it is your knowledge that is suspect. Since you refuse to discuss the challenges to your knowledge, it follows that you are unwilling to defend the foundation behind this claim. We properly reject it.

he should not have ventured out of his field, which is psychology.

But you are not a psychologist. Your judgment regarding whether an experimental psychologist can be competent in statistical analysis is entirely without foundation and therefore carries no weight as evidence. Your uninformed understanding of what may be in or outside the field of psychology is without foundation and carries no weight as evidence. You cannot demonstrate basic competence in statistical modeling, so your judgment regarding how well someone else has done it is without foundation and carries no weight as evidence.

I hope that he sees my posts and finds the courage to respond to them.

Are you accusing Dr. John Palmer of cowardice? You mentioned earlier that you intended to write a paper defending PEAR. I suggest if you want to attract Dr Palmer's attention, you write that paper and try to get it published in the journal he edits, Journal of Parapsychology. Or perhaps in Journal of Scientific Exploration, the journal that published PEAR's research originally. Demanding that he somehow know of this forum and of you is a bit optimistic. Rather than demand that others meet you on your terms, why don't you instead demonstrate that you can move in their circles, and take affirmative steps to put your wisdom before the authors you criticize? Wouldn't this be more effective, and reach a wider audience, than hammering away here in a web forum?

Jeffers chose a wrong and illogical strategy by trying to show in his second article that telekinesis doesn’t exist.

No such motive is mentioned in the article, nor is any such sentiment in the statement of findings and interpretation. Further, the article was co-authored by a PEAR researcher. That makes it a little difficult for you to argue that it was a hatchet job.

Jeffers’ claims of impossibility of having telekinetic abilities...

Jeffers makes no such claim.

It seems strange that a seasoned experimenter would make a childish mistake of such magnitude, but Jeffers managed to do just that.

Or, as I have demonstrated, you don't understand his experiments and can't explain why none of Jeffers' other critics managed to notice the allegedly prominent errors you accuse him of. Yes, it is indeed strange that a researcher of his stature would stoop to the errors you have attempted to pin on him. I propose that the most parsimonious explanation for that predicament is that your accusations are as vacuous and ill-informed as I have shown them to be.
 
One of my opponents wrote extensively about the topics that have no relevance to my posts and to Palmer’s, Wood’s and Jeffers’ articles as well. I didn’t bother to read his posts in their entirety, and I am not going to waste my time responding to them.
It is really too bad that you demonstrate a lack of knowledge in the area where you criticize others, and when JayUtah writes a series of long, comprehensive posts to educate you, you treat them as TLDR and refuse to learn.
 
“Because of the physical setup, it is hard to imagine how the subjects could have physically bent the specimens while they were attached to the recording devices without detection by an experimenter (or the video recording, when used), or without leaving an obvious tell-tale trace on the chart record. This comment does not apply to the twisted metal strips, however, which were left unobserved in a room. In this case, documentation is insufficient to rule out someone entering the room undetected and manipulating the specimen. Although twists as tight as those observed seem difficult to produce, even granting that shear forces are involved, the difficulty or possibility of mechanically producing such deformations cannot be assessed without extensive control tests” Palmer, page 189

It is much harder to twist a metal rod than to bend it; everybody knows that. A subject would have to smuggle an instrument into the session room to twist a rod. Palmer’s “suspicions” are completely unwarranted; at the very least he should have suggested a plausible method for manipulating the specimens, but he didn’t, which makes his critique very weak.

“In none of the cases is information given to reassure the reader that either physical deformation of the specimens or substitution of an already deformed specimen was precluded as a possibility at some point during the session (e.g., before the specimen was mounted). In particular, I could find no mention of specimens having been marked. Although no positive evidence of such manipulations exists, Hasted's lack of sensitivity to this issue in his reports reduces the confidence one can place in the observed deformations being truly anomalous. The fact that his subjects were teenagers is not an argument against trickery being employed, although Hasted sometimes implies that it is.” Palmer, page 189

As they say in court, innocent until proven guilty; this criterion also applies to scientific research. A scientist or his subjects should not be accused of rigging the equipment if there is no definitive proof of dishonesty.

“[It] seems unlikely that a subject could consistently get away with touching a specimen without being detected. This is especially true in the case of Nicholas Williams, who customarily stationed himself several feet from the specimens. Also, it again should be noted that some sessions were videotaped, and touch detectors were sometimes employed. Blowing on the specimens would be more difficult to detect, however. According to Isaacs (1984), only air currents powerful enough to cause the rigidly mounted specimens to swing would be powerful enough to be detected by the amplification and recording system.” Palmer, page 190

This goes back to the same argument that Palmer made on page 189, namely that it is possible to manually affect the metal bars used in Hasted’s experiments, which is ridiculous. Blowing on the specimens wouldn’t do any good either, because the amplifiers augment the signals that they receive directly from the sensors attached to the bars; the movement of a sample would not change the sensor data in any way. If I have time today, I will read Isaacs’ article and discuss it tomorrow.

Isaacs, Some Aspects of Performance at a Psychokinetic Task (unpublished doctoral dissertation)

http://publications.aston.ac.uk/12309/1/Isaacs_JD_1984.pdf

Isaacs’ dissertation is quite long and in general it is in favor of psychokinetic research. I didn’t have time to find the nonsense that he wrote about the air movement caused by a subject’s blowing. Perhaps Palmer misinterpreted Isaacs’ words. Anyway, tomorrow I will take a close look at the dissertation.
 
Here are a few resources I was able to locate after a brief web search to document the need for statistical proficiency in experimental psychology. Buddha claims Dr. John Palmer, as a professional experimental psychologist, has "ventured outside his field" by examining such things as statistical methods and experiment design performed by other PK researchers.

https://www.verywellmind.com/why-are-statistics-necessary-in-psychology-2795146 Advice to prospective psychology students on what they can expect in terms of education in statistics.

https://www.springer.com/us/book/9780852003695 A standard text on the subject, written for psychology students and practitioners. The chapter on normal distributions (p. 34) and their relationship to frequency data closely parallels the discussion I presented in the first of my four-part series. Buddha dismissed all that as irrelevant. Here, published authors in the field disagree.

https://en.wikipedia.org/wiki/Psychological_statistics The Wikipedia article on psychology statistics. Note carefully the section on factor analysis. That is what I alluded to in the seemingly irrelevant side-track into design validation in aerospace. It is not, as Buddha claimed, irrelevant or alien to either field. The fictional engineers in the sketch were admonished to use statistical methods to perform a factor analysis on their design.

Further, the factor analysis procedure described briefly in this article is what Stanley Jeffers invokes specifically in his work. He writes, "Given the ample evidence in the literature of statistical anomalies correlated with human intention, the major motivation for this effort was to improve our understanding of the dependencies and invariants of the process, rather than simply to provide more evidence of such anomalies " [Jeffers et al., op. cit. 1998, p. 547] Buddha wrongly reports the purpose of this experiment as disproving the possibility of psychokinesis. Jeffers continues [Ibid., p. 549], "The experiments conducted at Princeton University showed marginal evidence of an anomalous effect at a scale consistent with that of similar experiments with larger databases and corresponding larger effects." And he elsewhere urges further work using the lessons learned from these experiments. I submit this as further evidence that Buddha either does not understand the papers he criticizes, or is deliberately misrepresenting their contents.

https://www.ejwagenmakers.com/2011/WetzelsEtAl2011_855.pdf A paper from Perspectives on the Psychological Sciences styled as metaresearch into prevailing statistical methods in psychology. It also confirms the use of t-tests in such studies.

https://psychology.columbia.edu/content/research-methods-statistics-courses A statement of statistics requirements and elective courses in the Psychology department at Columbia University, the university with which Buddha has most closely claimed alignment.

https://www.psychologicalscience.or...f-learning-statistics-for-psychology-students An essay written for a professional organization in the behavioral sciences decrying the lack of preparation in mathematics exhibited by students entering the field.

https://archived.parapsych.org/members/j_palmer.html The curriculum vitae of Dr. John Palmer, as reported by the professional association for which he served as head.

From these references a number of things become clear.
  1. John Palmer is well qualified as an experimental psychologist.
  2. The topics I covered in my four-part offering are the same topics covered by published authors in the field and are not irrelevant to analyses of the validity of research methods.
  3. The qualifications for professional practice in experimental psychology require considerable proficiency in statistical methods.
  4. Academic programs purporting to train for such professions teach statistical methods.
  5. Lay knowledge of the profession, as exhibited by prospective students, generally does not anticipate or comprehend the role of statistics.
  6. Dr. Stanley Jeffers, in framing his single- and double-slit experiments as factor analysis, has performed appropriate research in experimental psychology.

For these and additional reasons, Buddha's claim that Dr John Palmer is categorically not qualified in statistics is rejected.
 
Jeffers chose a wrong and illogical strategy by trying to show in his second article that telekinesis doesn’t exist.

Suppose you run a series of flawless experiments on a group of subjects and show that the group did not exhibit telekinetic abilities. Does this mean that no one on earth has telekinetic abilities? Of course not – your experiments simply show that your subjects are not telekinetic. To prove that telekinesis doesn’t exist, you would have to test the entire population of the earth. Good luck with that!

I'm sure this is why the saying, "You can't prove a negative," exists. That simply isn't the way science works, trying to prove that something doesn't exist.

No. You have a hypothesis that telekinesis exists. You devise an experiment to test your hypothesis. Your results either support your hypothesis or they do not. IF they do, you and others repeat and expand on your results until you have replicated, peer-reviewed results which might inform a new theory. If they do not, you change up the experiment until either you consistently fail to get results that support your hypothesis and give up or you finally find support.

OK, that's probably an oversimplified 80's public school version of the scientific method but it suffices to make the point. The only thing that matters is scientific evidence that supports the hypothesis of telekinesis. It's meaningless to say, "well, I know telekinesis exists, we just haven't looked hard enough." Uh-huh. If you haven't found evidence for your hypothesis, then you have not supported your position. Period.
 
It is really too bad that you demonstrate a lack of knowledge in the area where you criticize others, and when JayUtah writes a series of long, comprehensive posts to educate you, you treat them as TLDR and refuse to learn.
I am going to reply to several posts at once, including yours. In my opinion Jay’s posts are irrelevant to the discussion, so I ignore them for the most part, but you have a different opinion. If a person believes that Jay’s posts are useful, he or she should study them diligently; this is a matter of personal preference. My goal is to appeal to the audience as a whole, not to Jay. The audience is smart, and I think the vast majority of the members understand my posts very well, and see that I provide the data relevant to the discussion and reject what has nothing to do with it. This doesn’t mean that everyone agrees with me, but any intelligent member sees that I am not asking them to waste their time on the evaluation of extraneous and useless information. The smart ones always win!

However, I will respond to a remark by Jay – he wrote something about establishing a baseline in Jeffers’ study. This is not what I meant – I meant that the knowledge of a test’s purpose affects the test results in an undesirable way.

Again, Jay’s references to Palmer’s works are irrelevant because they do not provide any data on the treatment of outliers, which was my request. Other than that, I do not see why I should read Palmer’s articles. If Jay provides links to any data regarding the treatment of outliers in psychological tests, I will gladly read the articles.

One more thing – Jay wrote that the single- and double-slit distributions in Jeffers’s experiments are not statistical variables, as he calls them. If this is true, Jeffers did a hell of a lot of useless work that was not needed for his experiments.

Now the rules have been changed and everyone has to wait until their post is approved before posting the next one, so I will continue my discussion of Palmer’s report here.

“In the most recent phase of his research, Hasted has shifted from strain gauges to piezoelectric sensors (Hasted, Robertson, & Arathoon, 1983). As used by Hasted, piezoelectric sensors measure the rate of change of stress rather than the level of stress per se. This makes them more sensitive than the strain gauges to the rapidly varying pulses that seem to characterize the ostensible PK effects. However, in order to minimize electrostatic artifact, Hasted had to eliminate much of this added sensitivity by connecting the high resistance piezoelectric transducer across a relatively low resistance (3.5 K ohms). Nonetheless, the overall piezoelectric system was still more sensitive than the strain gauges to the signals of interest.” Palmer, page 186

Yesterday I quoted Palmer’s remark about the possibility of changing test results by blowing on a specimen. This article provides basic data on piezoelectricity.

https://www.nanomotion.com/piezo-ceramic-motor-technology/piezoelectric-effect/

Some piezoelectric sensors are very sensitive and respond even to the slightest winds, such as blowing. However, others are much less sensitive because they are built of different materials. A manufacturer chooses appropriate sensors for a specific application. Clearly, the sensors used to measure structural changes in metal are designed in such a way that they do not respond to the air movement in a lab.

As for a specimen’s position, it doesn’t affect the readings, contrary to Palmer’s suggestion.

“Hasted expresses caution about these unwitnessed events but it is difficult to explain them as fraudulent since personal communication with Hasted (1984) has established that the search for hidden confederates left little opportunity for concealment. Instrumental records of four of these folding events were obtained (see appendix I).” Isaacs, page 60.

Apparently, there was no possibility of fraud in Hasted’s lab.
 
My goal is to appeal to the audience as a whole, not to Jay.

Buddha,

If you want to appeal to the audience as a whole, you need to respond to JayUtah's posts in a constructive and honest manner. Up until the moment the moderators took over, he was the one in control of the thread and its topics, not you. Even now, he is the one steering the discussion, not you. I literally created a forum poll so people could anonymously voice support for you. You did not fare well. JayUtah, however, came away the clear winner by a large margin.

The audience is smart, and I think the vast majority of the members understand my posts very well, and see that I provide the data relevant to the discussion and reject what has nothing to do with it.

You are the only one rejecting JayUtah's posts. Your belief that people are accepting you and rejecting him has absolutely no foundation in reality. <snip>


Edited by Loss Leader: 
Edited. Moderated thread.
 
