• Quick note - the problem with Youtube videos not embedding on the forum appears to have been fixed, thanks to ZiprHead. If you do still see problems let me know.

Check my methodology - prayer study

I don't think I can do so, because this study has never been performed before in the manner I want to do it. Plus, the affidavit you suggest wouldn't meet the standard anyway.

However, I don't think it comes under the clause for which that "3 affidavit" requirement was made (namely, really extreme claims of personal power), so hopefully it should not be relevant. Also, you're conflating two different things: affidavits confirming that it's something worth JREF's bother (i.e. to filter out excessively extreme claims, like creating lights around oneself spontaneously), and people to participate. The latter should not be difficult.

You still haven't answered my request though.

If you mean point out things that may cause a false positive, I've not claimed there exist such flaws. If you mean flaws in general, some have even been mentioned by yourself (potential for false negatives, sample size questions, population distribution)

I would not count on the affidavit requirement being waved, especially given the JREF's currently limited ability to process claims.
 
Startz - Thanks, again, for the pragmatic perspective. :)

(Sorry on the civic duty thing being so mindnumbing though. Good thing you brought something to amuse yourself with.)

The moral (besides that jury duty can be mind-numbing) is that perhaps the attention being paid to statistics is displacing attention that might be better spent looking for loopholes and tricks.

Indeed. I do not want there to be any potential for someone to claim a loophole or trick, and have no intention of trying to use one. So far I think I've closed everything, and am waiting for someone to point out anything I've missed.

I think as we've pointed out (and your statsruns and story demonstrate), the statistics are sound.

One point I'd like suggestions on: I would like a protocol that both assures JREF the ability to verify the source of submissions to be not inappropriately influenced by me (e.g., that the mailed signed verifications are not forgeries), but also ensures participant privacy to the maximum degree possible (e.g., by completely separating their data from their contact info). Ideas?

petre - Your previous posts certainly seemed to be claiming that there were "obvious flaws" in the study that would make it unacceptable to a skeptic. But no matter.

Could you please respond to my request about how the potential for false *negatives* could be mitigated, and what exactly you see as the "obvious flaws" relating to that?

So far I have agreed that I am not testing prayer that is limited to friends or people in constant contact; I think this is a reasonable limit and necessary to a study that is practicable. Am I limiting it excessively in any other way?
 
petre - Your previous posts certainly seemed to be claiming that there were "obvious flaws" in the study that would make it unacceptable to a skeptic. But no matter.

The point of view I try to take when assessing protocols is "JREF application processor" rather than "skeptic", which is only slightly different. It's my hope that such analysis is of greater help to potential applicants. You may feel the points made by other posters and myself will not concern JREF, and if such is the case I wish you luck in your continued protocol negotiations.

On the matter of flaws, perhaps if I enumerate them from my previous post they will stand out more clearly:

1. Potential for false negatives. This is a point you, yourself, have made I believe. I suggested using previous studies as a guide to address this specific issue, but you've indicated that you haven't the resources for such an extensive study. Perhaps there is no answer to correct this flaw within your means.
2. Concerns about sample size. I do not feel you've sufficiently convinced many, or at least me, that you'll achieve a significant sample size, which I believe is the only specific criticism you've made of the existing studies. Perhaps you could set a benchmark by naming a specific study you feel was handled properly and was lacking only in sample size, then we'd have an idea how large a sample you'll need to find results that study missed due to small sample size.
3. Population distribution (self-selection and other bias, uneven post-randomization distributions, etc). While this may not fit the description of "obvious" for every viewer of this forum, it does seem clear that there are still some concerns about how this question. Again, modeling the adjustments made in previous studies could be of benefit there.

Certain skeptics would be more than happy to approve of your study I'm sure, since you'd either declare at the end of the study that no significant result was found (consistent with existing studies) or they simply could claim that your methodology was flawed and discount any positive result you might get. I'd prefer to do the legwork beforehand to make a positive result more meaningful, and I believe the JREF would as well.

Another matter that crossed my mind was the time frame. The JREF has not yet approved a protocol of such long duration (I seem to recall an application predicting something several years in the future, and I believe said applicant was advised to re-apply within one year prior to the event, as applications were only good for one year). While they may be willing to make an exception given the nature of this claim, I have doubts that they will accept a protocol that states "the test will continue until enough people have participated, even if it takes several decades".
 
snip...

But since about all I have to contribute is on statistics, I re-ran the simulation using two different variances and reducing the sample size to 20. In 50,000 simulations there were 5.36 percent false positives, about as one would expect.

Try it another way, place 80% randomly in one group with N(8,2) and 20% with N(3,1) and the reverse for the other group. What result do you get now? What happens when you adjust for the population they came from? Do you see how having a higher proportion from one population will skew the result if you don't adjust for it?

As for N(x,y) above, N is the normal distribution, x is mu, and y is sigmasquared...

And jury duty sucks...

ETA: added the word randomly
 
>snip

I think as we've pointed out (and your statsruns and story demonstrate), the statistics are sound.

Just so that I'm being clear, I haven't said that the statistics are sound. All I've said is that some of the questions raised don't seem to point to large problems.

In fact, I don't think you've been specific enough about what test is going to be used and how the data is going to be treated. Among other things, we don't know how a "score" is going to be computed for each person. Until that's nailed down, I don't think JREF should entertain a protocol.

I don't know much about conducting studies where a participant might try to influence the outcome. That's why we have magicians.

But here's one way to stuff the ballot box. Have one recipient show up with some unusual identifying characteristic. Have one healer in cahoots who knows whether that person is going to be prayed for and can signal the "experimenter." I think that's all it takes to guarantee a win, no matter how large the sample is.
 
Wrong. I am not testing the diff between (people who get no remote prayer) and (people who do).

I am testing the diff between (people who are normal, and may be getting prayer in their usual manner) and (people who ALSO get extra prayer through me).

You are making a believer's argument that the results would be a false negative... which given the forum means you're probably confused.

But you do not know that the people in the test group are actually getting more prayer than the people in the control group. That's the whole problem. You don't know, and you can't know. How can you pretend to claim that your results, positive or negative, will mean anything whatsoever?

I'm not sure exactly what you mean by saying I am making a "believer's argument." Is that supposed to be some kind of insult?
 
The point of view I try to take when assessing protocols is "JREF application processor" rather than "skeptic", which is only slightly different. It's my hope that such analysis is of greater help to potential applicants. You may feel the points made by other posters and myself will not concern JREF, and if such is the case I wish you luck in your continued protocol negotiations.

Understood.

1. Potential for false negatives. This is a point you, yourself, have made I believe. I suggested using previous studies as a guide to address this specific issue, but you've indicated that you haven't the resources for such an extensive study. Perhaps there is no answer to correct this flaw within your means.

Do you see any false negative potential OTHER than what I said (familiarity/contact requirement)? E.g., insufficient information provided to Healer, other requirements, ...

2. Concerns about sample size. I do not feel you've sufficiently convinced many, or at least me, that you'll achieve a significant sample size, which I believe is the only specific criticism you've made of the existing studies. Perhaps you could set a benchmark by naming a specific study you feel was handled properly and was lacking only in sample size, then we'd have an idea how large a sample you'll need to find results that study missed due to small sample size.

I'll have to respond to that later as I don't have my reference on hand (it's buried in some boxes somewhere). IIRC their sample size was about 25.

However, I've seen very few studies of remote intercessionary prayer that I felt were methodologically sound to begin with, and none that were run the way I'd like to do it.

3. Population distribution (self-selection and other bias, uneven post-randomization distributions, etc). While this may not fit the description of "obvious" for every viewer of this forum, it does seem clear that there are still some concerns about how this question. Again, modeling the adjustments made in previous studies could be of benefit there.

Indeed. However, I think that Startz' applied modeling so far shows that this is not a significant (sic) problem. If someone can demonstrate that it is - ie it'll cause a false positive more than 5% of the time - then I'll certainly reconsider.

Certain skeptics would be more than happy to approve of your study I'm sure, since you'd either declare at the end of the study that no significant result was found (consistent with existing studies) or they simply could claim that your methodology was flawed and discount any positive result you might get. I'd prefer to do the legwork beforehand to make a positive result more meaningful, and I believe the JREF would as well.

Agreed.

Another matter that crossed my mind was the time frame. The JREF has not yet approved a protocol of such long duration (I seem to recall an application predicting something several years in the future, and I believe said applicant was advised to re-apply within one year prior to the event, as applications were only good for one year). While they may be willing to make an exception given the nature of this claim, I have doubts that they will accept a protocol that states "the test will continue until enough people have participated, even if it takes several decades".

Quote from my correspondance with Jeff:
The "general" setup is fine. We won't close the application as long as work is being done on it, and we haven't met an permanent impasse.

Jeff Wagg
JREF

On 8/24/06, Sai wrote:

Of course. Would you please verify before I go through the trouble of
doing so, though, that:
1. the general setup (i.e. a multi-group study involving many people
over a long period of time) is acceptable to JREF
2. JREF is willing to keep the file open without re-application for
the duration of the study so long as it is ongoing

Thanks,
- Sai

... so I think that should be okay within reasonable limits. I have explicitly said to Jeff that it will take more than a year per phase.


Startz said:
Just so that I'm being clear, I haven't said that the statistics are sound. All I've said is that some of the questions raised don't seem to point to large problems.

Point.

In fact, I don't think you've been specific enough about what test is going to be used and how the data is going to be treated. Among other things, we don't know how a "score" is going to be computed for each person. Until that's nailed down, I don't think JREF should entertain a protocol.

I've tried to be specific as to the paramaters of the score equation. I can't and won't say in advance what it will be, since it'll be based on the previous round's data (to better tune the equation).

However, I have yet to see anyone point out a real methodological flaw with *any* possible score equation I could come up with that doesn't access the assignments database.

I don't know much about conducting studies where a participant might try to influence the outcome. That's why we have magicians.

But here's one way to stuff the ballot box. Have one recipient show up with some unusual identifying characteristic. Have one healer in cahoots who knows whether that person is going to be prayed for and can signal the "experimenter." I think that's all it takes to guarantee a win, no matter how large the sample is.

I don't think I understand your example. Could you elaborate?

Also please note that there is a pretty small chance of any particular Healer knowing any particular Recipient, and that I will be requiring them to sign something saying they haven't communicated with any Recipient...

Gr8wight said:
But you do not know that the people in the test group are actually getting more prayer than the people in the control group. That's the whole problem. You don't know, and you can't know. How can you pretend to claim that your results, positive or negative, will mean anything whatsoever?

1. I will be tracking how many people the Recipients believe are praying for them. This should be significantly equal between active and control groups.
2. They should also be significantly equal in *actual* measure, since the selection process (randomization) in no way has a potential to bias on that measure.
3. The active group is getting baseline + study; the control group is just getting baseline.
4. Ergo, the active group is getting more than the control group.

Also I should point out that AT MOST what you are claiming is that this would be a false negative, not a false positive (and even that is a statistically very unlikely claim).

I'm not sure exactly what you mean by saying I am making a "believer's argument." Is that supposed to be some kind of insult?

No, it's not. It simply means that you are taking the point of view of a believer, not a skeptic, and are arguing that my flaw as you perceive it could cause a false negative. You are NOT arguing that it could cause a false positive.

I simply want that to be very clear; I certainly don't mind that that is the case (as with petre above).
 
BTW Startz: Thanks for the monte carlo simulations. They're quite helpful for cutting through arguments about stats theory.

My matlab skillz are pretty out of date (last time I used it was when I was taking BC Calc AP, my soph year of HS... 1997-98) so I'd probably end up coding it in Ruby instead if I tried to. :p
 
1. I will be tracking how many people the Recipients believe are praying for them. This should be significantly equal between active and control groups.
2. They should also be significantly equal in *actual* measure, since the selection process (randomization) in no way has a potential to bias on that measure.
3. The active group is getting baseline + study; the control group is just getting baseline.
4. Ergo, the active group is getting more than the control group.

Also I should point out that AT MOST what you are claiming is that this would be a false negative, not a false positive (and even that is a statistically very unlikely claim).

If randomisation is your only control, your sample size needs to be huge for it to be effective. A few hundred is probably not enough.



No, it's not. It simply means that you are taking the point of view of a believer, not a skeptic, and are arguing that my flaw as you perceive it could cause a false negative. You are NOT arguing that it could cause a false positive.

I simply want that to be very clear; I certainly don't mind that that is the case (as with petre above).

I am arguing that your results, regardless of whether they are positive or negative will be meaningless.
 
Just so that I'm being clear, I haven't said that the statistics are sound. All I've said is that some of the questions raised don't seem to point to large problems.

In fact, I don't think you've been specific enough about what test is going to be used and how the data is going to be treated. Among other things, we don't know how a "score" is going to be computed for each person. Until that's nailed down, I don't think JREF should entertain a protocol.

I don't know much about conducting studies where a participant might try to influence the outcome. That's why we have magicians.

But here's one way to stuff the ballot box. Have one recipient show up with some unusual identifying characteristic. Have one healer in cahoots who knows whether that person is going to be prayed for and can signal the "experimenter." I think that's all it takes to guarantee a win, no matter how large the sample is.

Which is exactly what I've been trying to get across, we've got the hypothesis which is to test if intercessory prayer can improve outcomes in sick people but we need answers for the following:

1) What is the measure of the outcome?
2) What is the clinically significant difference of this measure?
3) What are the type of diseases or conditions that will be studied?
4) How will the confounders (disease severity, current treatment, gender, culture, religiosity, dropouts and losses to followup) be adjusted for in the analysis?

All of the important other stuff such as funding, recruitment randomization, staffing, data management, etc. are a moot point until these things are addressed. A CRO performing a clinical trial wouldn't start recruiting people until these things are answered...

And Startz, I thank you for your attempts at simulating but I didn't do a very good job at describing how to simulate a multivariate model with confounders nor do I have the expertise to do it very well anymore as I no longer do research in statistics. They're a very good start but they need some tweaking, you might do a lit search in Biometrics, JASA, or Biometrika to find out how if you're still interested...

But I know how to design a clinical trial as I still do this when I'm evaluating crime control policies and treatment programs and in my opinion, Saizai is nowhere near close to designing an acceptable clinical trial to test the effect of intercessory prayer on disease outcome...

Nor does he seem very interested in heeding some sound statistical and clinical advice that many of us have given him so I think we should wish him luck and just leave him to his study...
 
Placebos generally aren't given for serious diseases. New treatments are compared with accepted treatment, but not with a placebo. Deliberately withholding treatment from a potentially fatal disease like cancer would be considered serious misconduct, and possibly murder, if there is any treatment available that gives a better survival rate than placebo.
Maybe this has been addressed, I'm behind in this thread, but this is simply false. A double blind placebo controlled prospective study is the ideal clinical trial.

Placebos are given to test treatments in life threatening diseases all the time. What you are not thinking through here is the study drug or treatment may be harmful, may not help, or may work. That's what the study is for. So until the treatment is shown to be effective by the treated group doing better than the placebo group, neither group really is "better" to be in.

Often experimental treatments are done on terminally ill patients when all other treatments fail. Or new treatments may be added to current therapy again with placebo control. And often when a new drug is developed, the first people it is tested on may be healthy volunteers. That may be done to see what kind of tolerance people have for the drugs. Again, placebo controlled.

As far as the ethics go, you do not "lie" to the patient except in a few rare circumstances, more often in psychology research than in medical research. Sometimes you distract the patient by telling them you are looking for X while you are really looking for Y. But the usual way placebo controlled trials are done is the patient is told they have a 50:50 chance of being in either group. And until the results are analyzed neither the observer nor the patient know whether drug or placebo was given. That is what double blind means as most of you know.
 
Last edited:
...

No ethics board would ever, under any circumstances whatsoever, agree to giving patients a placebo while telling them that it is actually an effective medicine.
Again, this is simply not true.

Placebo research has been conducted at UCLA for example. Here are some of the reports.

This paper discusses the ethics of deceiving research subjects and suggests the following:
...participants can be informed prior to deciding whether to volunteer for a study that the experimental procedures will not be described accurately or that some features of these procedures will or may be misleading or deceptive [25,26]. This approach, which we call “authorized deception,” permits research participants to decide whether they wish to participate in research involving deception and, if so, to knowingly authorize its use. Authorized deception is compatible with the spirit of informed consent. It fosters respect for persons, despite the use of deception, by alerting prospective participants to the fact that some or all participants will be deliberately deceived about the purpose of the research or the nature of research procedures.

For example, investigators using the balanced placebo design to study expectancy and pharmacological effects of dexamfetamine described the informed consent disclosure as follows: “For ethical reasons it was stated in the consent form that ‘…some information and/or instructions given [to the participant] may be inaccurate’” [15]. This statement recognizes the ethical force of authorized deception, but does not seem to go far enough. As illustrated above, the balanced placebo design involves lying to participants in two arms of the study: some participants are told that they are being administered a particular drug when in fact they receive placebo, and others that they are being administered placebo when in fact they receive the drug. Consequently, it is at best an understatement to describe the disclosure in this experiment as possibly involving “inaccurate” information. It would be more accurate to inform the prospective participants that some research participants will be misled or deceived.

But then keep in mind the consensus is currently that the placebo effect has been overrated. There is still a lot we don't know about the mind body connection. So the jury is by no means in on this matter.

Typically giving a placebo makes the control group similar in every way except the treatment. It also would be impossible to blind the observers and subjects if a placebo were not used. Just knowing you did or didn't get a treatment alters the outcome so it isn't merely the placebo effect of believing you got the treatment.
 
skeptigirl: I fully cede and agree to the point of participants being a self-selected, non-random subset of the general population of cancer victims.

However, I need to ask you to explain how that could possibly create a false positive difference between the active and control groups, since both are drawn from the same (admittedly self-selected) pool and assigned randomly (from that pool).

I also entirely agree that the control group will likely be prayed for, and challenge you to the same question on this point as well.

Your doubt is not an argument. :)
I've tried to find in these 4 pages how it is you can have a control group that is prayed for and a test group that is prayed for and expect to see the effect of prayer.

You agree that self selected participants are not random. It didn't seem you were disagreeing with my hypothesis that self selected participants would include more 'believers' than a random sample and that more 'believers' would have additional people praying for them.

Here are some things to consider. You have to be more specific about what you are actually measuring. In other words,
  • Are you testing if 10 people praying will have a greater impact than one person praying? (quantity)
  • Are you testing if praying has an effect? (quality)
  • Are you testing if the particular people praying in your study have an effect? (specific quality)
  • Are you testing if the particular way prayers are performed has an effect? (specific quality)

You need to clearly define what it is you are actually measuring. Just saying you are testing if prayer has an effect doesn't allow you to determine if your control group is a true control group. I can't see that it is as far as you have gotten here. You have failed to explain how your control group will essentially differ from the test group if everyone is prayed for in both groups.
 
Last edited:
skeptigirl, gr8wight:

I've said repeatedly.... I am testing for the ADDITIVE effect of prayer.

I am not going to try to make sure that the control group is not prayed for at all (that would be impossible as well as unethical). What I can say with certainty is that they aren't getting any extra prayer from their participation, and with statistical certainty that they (as a group) are getting less than the active group.

That diff is what I am testing.

As a subset I will also be doing analysis of other variables I'm tracking - e.g. whether religion, directedness, frequency, etc *correlate* to effectiveness. Same thing for disease outcomes (eg perhaps pain is more affected than survival rate or $ spent in treatment). This part is not intended (at this point) to be a causal study, just a correlative analysis; I'll be using it to construct the score equation & additional prerequisites for participation in later rounds. Those will still be done in the same, causal double blind model; if the correlative data is accurate then one would expect a larger effect demonstrated as it's better tuned.

I can do a similar thing with "specific quality" by eg making a bell curve of Healer effectiveness (from the average Scores of their Recipients).


digithead - I think there's been no question that I have answered your points 1-3 explicitly. Reiterating them is somewhat silly.

Your point #4 is understood, but you haven't demonstrated that Startz' monte carlo simulation of your objection is inaccurate, and it shows that your objection is unfounded.

I heed advice, but not blindly - I get to be a skeptic too, y'know. You have to make your case, and so far you have not. You've just stated that you *think* these confounders would result in a false positive, but you haven't given any monte carlo sims or math to back it up or refute Startz' counter examples.

As for "funding, recruitment randomization, staffing, data management, etc", those haven't been even raised as issues (randomization ain't exactly difficult with a computer btw).

IMHO you're trying to make this look "very far away from settled" when in reality it's pretty much a solid protocol with very little tweaking required to finish and be completely sound.
 
Last edited:
Maybe this has been addressed, I'm behind in this thread, but this is simply false. A double blind placebo controlled prospective study is the ideal clinical trial.

Placebos are given to test treatments in life threatening diseases all the time. What you are not thinking through here is the study drug or treatment may be harmful, may not help, or may work. That's what the study is for. So until the treatment is shown to be effective by the treated group doing better than the placebo group, neither group really is "better" to be in.

Often experimental treatments are done on terminally ill patients when all other treatments fail. Or new treatments may be added to current therapy again with placebo control. And often when a new drug is developed, the first people it is tested on may be healthy volunteers. That may be done to see what kind of tolerance people have for the drugs. Again, placebo controlled.

As far as the ethics go, you do not "lie" to the patient except in a few rare circumstances, more often in psychology research than in medical research. Sometimes you distract the patient by telling them you are looking for X while you are really looking for Y. But the usual way placebo controlled trials are done is the patient is told they have a 50:50 chance of being in either group. And until the results are analyzed neither the observer nor the patient know whether drug or placebo was given. That is what double blind means as most of you know.

I stand by my original statement. Obviously I agree that a double-blind, placebo controled trial is the ideal, but this does not always happen. Much of the debate over evidence-based medicine (within the medical community, not in the media) has come from the realisation that there are many treatments, especially in surgery, that have never been tested properly.

If a treatment is accepted for a life-threatening condition then any new conditions are usually only tested against the previous treatment, since if someone dies while on placebo the people conducting the trial will be guilty of witholding treatment. This is not the best situation for science, but sadly this is how it works in the courts. Signing a waiver that states you are aware you may only get a placebo does not affect this.

Surgery is even worse, since there are risks associated with any surgery. This makes it almost impossible to do placebo controls, although it has been done a few times for minor operations. Much surgery is accepted because if something is wrong, cutting it out seems to be an obvious way to solve it. Unfortunately this may not always be true, and recently some procedures have been brought in to question precisely because they have only been compared with other procedures and never with a placebo.

All the links you posted in response to Yahzi seem to refer to non-fatal diseases and as such your arguments are entirely true. This is not the case that I was arguing, where there are serious ethical problems involved with placebos when a lack of treatment can be fatal.
 
I'll have to respond to that later as I don't have my reference on hand (it's buried in some boxes somewhere). IIRC their sample size was about 25.

However, I've seen very few studies of remote intercessionary prayer that I felt were methodologically sound to begin with, and none that were run the way I'd like to do it.

This is one of our major concerns. It doesn't matter how good your method is if you don't have enough people. 25 is not large enough unless you can absolutely guarantee that the two groups are of similar composition. You have said that you will be able to observe this, but you have not said how you will observe it, or what you will do about it if there is a problem.

Indeed. However, I think that Startz' applied modeling so far shows that this is not a significant (sic) problem. If someone can demonstrate that it is - ie it'll cause a false positive more than 5% of the time - then I'll certainly reconsider.

As digithead has said, Startz's model is not complete. It uses only one variable where many need to be considered, and was set up as a quick example that has not been shown to accurately model what you are proposing. It is not up to us to show that this model is wrong, it is up to you to show that there is nothing else that could affect it.

I've tried to be specific as to the paramaters of the score equation. I can't and won't say in advance what it will be, since it'll be based on the previous round's data (to better tune the equation).

However, I have yet to see anyone point out a real methodological flaw with *any* possible score equation I could come up with that doesn't access the assignments database.

You must explain exactly how you obtain the scores or the whole trial is meaningless. The method must be decided in advance and cannot be changed depending on the results, since this is exactly how bad conclusions can be made from otherwise good trials. The classic statistical error is to take a set of data and analyse it until you find a correlation with something, which is almost always possible. This may not be the case here, but you must show that it is not.

Also please note that there is a pretty small chance of any particular Healer knowing any particular Recipient, and that I will be requiring them to sign something saying they haven't communicated with any Recipient...

Not all people are above fraud. Signing a statement does not mean they mean it. How will you prove that they are telling the truth?


1. I will be tracking how many people the Recipients believe are praying for them. This should be significantly equal between active and control groups.
2. They should also be significantly equal in *actual* measure, since the selection process (randomization) in no way has a potential to bias on that measure.
3. The active group is getting baseline + study; the control group is just getting baseline.
4. Ergo, the active group is getting more than the control group.

As Gr8wight and I have said, this is simply not true. You need very large samples before you can rely on randomisation. If you expect a sample of around 25, as in the study you refered to, this is nowhere near large enough. Also, you must show that randomisation achieves this, whatever your sample size, not just assume it does.

Also I should point out that AT MOST what you are claiming is that this would be a false negative, not a false positive (and even that is a statistically very unlikely claim).

False negative or false positive, the important word is "false".[/QUOTE]

skeptigirl, gr8wight:

I've said repeatedly.... I am testing for the ADDITIVE effect of prayer.

I am not going to try to make sure that the control group is not prayed for at all (that would be impossible as well as unethical). What I can say with certainty is that they aren't getting any extra prayer from their participation, and with statistical certainty that they (as a group) are getting less than the active group.

That diff is what I am testing.[.quote]

The trouble is it is not statistical certainty. You need to demonstrate this, not assume it.

As a subset I will also be doing analysis of other variables I'm tracking - e.g. whether religion, directedness, frequency, etc *correlate* to effectiveness. Same thing for disease outcomes (eg perhaps pain is more affected than survival rate or $ spent in treatment). This part is not intended (at this point) to be a causal study, just a correlative analysis; I'll be using it to construct the score equation & additional prerequisites for participation in later rounds. Those will still be done in the same, causal double blind model; if the correlative data is accurate then one would expect a larger effect demonstrated as it's better tuned.

This is the worst kind of analysis possible. If you gather data from people and then try to find a correlation with anything, you will find one. This is why studies only ever focus on one cause and try to control for all others. Occasionally a strong trend may be noticed that is commented upon and recommended for further study, but a trial that is set up to examine one possible correlation cannot reliably comment on any other.

digithead - I think there's been no question that I have answered your points 1-3 explicitly. Reiterating them is somewhat silly.

Point 1 asked for your measure for the outcome, which you explicitly stated you would not provide and said you would change for different trials. This is not acceptable for a medical trial.

Point 3 asked what diseases you would look at. All you have said is "cancer". This refers to hundreds of different diseases, many of which progress very differently from one another.

Your point #4 is understood, but you haven't demonstrated that Startz' monte carlo simulation of your objection is inaccurate, and it shows that your objection is unfounded.

In the same post digithead said excatly what was wrong with this simulation. Put simply, it is too simple. In any case, it is not up to us to show how it is wrong, but up to you to show that it acurately represents your trial, which is unlikely to be the case.

I heed advice, but not blindly - I get to be a skeptic too, y'know. You have to make your case, and so far you have not. You've just stated that you *think* these confounders would result in a false positive, but you haven't given any monte carlo sims or math to back it up or refute Startz' counter examples.

No, you have to make your case. If someone thinks something could be a problem, it is up to the person running the trial to show that it is not.

IMHO you're trying to make this look "very far away from settled" when in reality it's pretty much a solid protocol with very little tweaking required to finish and be completely sound.

The points raised say that this protocol is far from solid. Even if we assume that your protocol is perfect, you have to show this, which you have hardly even tried to do. I very much doubt the JREF would accept this without raising exactly the same points that we have. At the least they will require you to show that these points are not valid.
 
skeptigirl, gr8wight:

I've said repeatedly.... I am testing for the ADDITIVE effect of prayer.

I am not going to try to make sure that the control group is not prayed for at all (that would be impossible as well as unethical). What I can say with certainty is that they aren't getting any extra prayer from their participation, and with statistical certainty that they (as a group) are getting less than the active group.

That diff is what I am testing.

And I will repeat: if randomisation is your only control, your proposed sample size is way too small. Unless you can bump your numbers up into four digits, I can't see how you can call your results significant.
 
Thanks, Gr8wight, well summarized, I was getting tired of repeating myself...

But I will say it again, Saizai, you are willfully ignoring all of the issues that we have raised so far because you have apparently convinced yourself that your protocol is beyond reproach. Remember what we've told you because you will be shipwrecked by your own hubris...

In any event thanks for the material, I will be using it next time I teach research methods, I'm pretty sure any undergrad can figure out the same issues that some of us have raised against it...

It's time to abandon arguing about the protocol and try to discover why you feel it is so important to test the power of intercessory prayer on disease outcome. Especially when numerous published studies have shown its ineffectiveness. You claim you're agnostic and a skeptic but what is your motivation for doing this study? Because a true skeptic would take the already overwhelming evidence in failing to reject the null to conclude that intercessory prayer has no effect on disease outcome and move on to new matters. You've given us the how now give us the why...
 
Digithead et al.

Nice job trying to point saizai in the right direction. I would agree that the it's time to move to the "why" versus the "how". I may be beating a dead horse here (those live ones sure are hard to catch!), but I think the following quote is telling:

originally posted by saizai said:
They get to see the code before the commencement of round two, and verify that it in no way breaches double blind, installs a back door, accesses a recipient's assignment(s) status, or accesses Healers' data. They do not get to reject a proposed code based on any other reason.

Am I the only one bothered by the that last sentence? It seems rather confrontational (and smacks of distrust) for what is being presented as a plea to "check my methodology." When doing research, one should accept the fact that there are many ways to be wrong- probably more than there are ways to be right. Setting the conditions for which you can be judged as being wrong, a priori, is just not ok. You need to be open to errors being discovered, and willing to correct those errors. Through that statement, and his approach to those trying to help, saizai has indicated that he is not open to corrections or suggestions.
 
Do you see any false negative potential OTHER than what I said (familiarity/contact requirement)? E.g., insufficient information provided to Healer, other requirements, ...

How about:
- prayer only works if a specific diety is addressed (FSM maybe?)
- prayer only works if no one tracks the results
- prayer only works if done by a priest
- prayer only works if you donate heavily to a church
- prayer only works if done on Sunday
- prayer only works if spoken in Latin
- prayer only works if done while standing on your head
- prayer only works if done by 100 or more people

This is where actually believing that it works a certain way comes in handy. You can then narrow down what you think actually matters and find out if you're right. With no belief, there are a great many number of factors you need to control for to make it a really worthwhile test. The "likely" result of "no effect" (given that the only proposed improvement to existing studies is sample size) will at least have some meaning if you had some belief, in that it will encourage you to re-examine it.

I'll have to respond to that later as I don't have my reference on hand (it's buried in some boxes somewhere). IIRC their sample size was about 25.

However, I've seen very few studies of remote intercessionary prayer that I felt were methodologically sound to begin with, and none that were run the way I'd like to do it.

So to avoid their perils, you intend to use a greater sample size than those you've seen that did appear methodologically sound (which seem to have used, at most, 25 people, so perhaps 100 is a good enough target for your first round?) and to address any methodological failings you identified in studies that did use sufficient sample size.

So I suppose if someone were to present an existing study that used 100 or more people, you would identify an error in its methodology that your study will avoid somehow, or you will agree that even the sample size of that study was insufficient, and increase your definition of "too small of a sample size" to include the new study?
 

Back
Top Bottom