This is one of our major concerns. It doesn't matter how good your method is if you don't have enough people. 25 is not large enough unless you can absolutely guarantee that the two groups are of similar composition. You have said that you will be able to observe this, but you have not said how you will observe it, or what you will do about it if there is a problem.
1. I'll have more than 25.
2. I observe by the very simple expedient of asking.
No study has control and active groups that are exactly the same. They just draw from the same pool and make sure the study itself doesn't create differences. A spurious difference between the groups is statistically unlikely by definition: it will show up only p*100% of the time, for whatever p level you choose.
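That claim is easy to check by simulation. A minimal sketch, assuming a normally distributed baseline measure and a two-sample t-test - the group size, distribution, and alpha here are illustrative, not taken from the proposed protocol:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_per_group, n_sims, alpha = 25, 10_000, 0.05

false_positives = 0
for _ in range(n_sims):
    # One shared pool; the study itself adds no real group difference.
    pool = rng.normal(loc=50, scale=10, size=2 * n_per_group)
    rng.shuffle(pool)  # random assignment to the two groups
    control, active = pool[:n_per_group], pool[n_per_group:]
    _, p = stats.ttest_ind(control, active)
    false_positives += p < alpha

print(f"fraction of spurious 'significant' differences: {false_positives / n_sims:.3f}")
# Comes out near alpha (~0.05): randomized groups differ at the chosen
# p level only about p*100% of the time, which is the point above.
```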
As digithead has said, Startz's model is not complete. It uses only one variable where many need to be considered, and was set up as a quick example that has not been shown to accurately model what you are proposing. It is not up to us to show that this model is wrong, it is up to you to show that there is nothing else that could affect it.
Proving a negative is impossible. You claim there is a
real problem, therefore you have the much simpler task of proving a positive.
Just handwaving a claim that it's insufficiently controlled isn't good enough, especially when the Monte Carlo sim shows otherwise.
You must explain exactly how you obtain the scores or the whole trial is meaningless. The method must be decided in advance and cannot be changed depending on the results, since this is exactly how bad conclusions can be made from otherwise good trials. The classic statistical error is to take a set of data and analyse it until you find a correlation with something, which is almost always possible. This may not be the case here, but you must show that it is not.
Quite so. But did you completely ignore the places where I said that I would be determining the score equation
in advance of obtaining data?
Not all people are above fraud. Signing a statement does not mean they mean it. How will you prove that they are telling the truth?
They don't know what group they are in, therefore they have no way to lie in a way that would influence the results.
As Gr8wight and I have said, this is simply not true. You need very large samples before you can rely on randomisation. If you expect a sample of around 25, as in the study you referred to, this is nowhere near large enough. Also, you must show that randomisation achieves this, whatever your sample size, not just assume it does.
Not so, sorry. I've seen plenty of robust studies with N ≈ 25 that still manage p<.05, or <.01, or even <.001. It depends on the distribution of the measure in the pool. In any case it's self-correcting: if the pool is too small for the effect, you're very unlikely to obtain p<.05 at all. Simple enough.
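For what it's worth, a minimal power sketch along those lines, assuming a two-sample t-test on a roughly normal measure with 25 per group; the effect sizes d are illustrative only:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_per_group, n_sims, alpha = 25, 5_000, 0.05

# Effect sizes in standard-deviation units (Cohen's d); purely illustrative.
for d in (0.0, 0.3, 0.8, 1.2):
    hits = 0
    for _ in range(n_sims):
        control = rng.normal(0.0, 1.0, n_per_group)
        active = rng.normal(d, 1.0, n_per_group)
        _, p = stats.ttest_ind(control, active)
        hits += p < alpha
    print(f"d = {d:.1f}: p<.05 in {hits / n_sims:.2f} of simulated trials")

# With 25 per group a large effect shows up most of the time and a small
# one rarely does - the sample size limits what can plausibly come out
# significant, which is the sense in which it is self-correcting.
```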
False negative or false positive, the important word is "false".
Only if you're concerned with defending
a believer's perspective. If you're just concerned with protecting the challenge against fraud, then false positives are it.
This is the worst kind of analysis possible. If you gather data from people and then try to find a correlation with anything, you will find one. This is why studies only ever focus on one cause and try to control for all others. Occasionally a strong trend may be noticed that is commented upon and recommended for further study, but a trial that is set up to examine one possible correlation cannot reliably comment on any other.
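A small sketch of that error, using nothing but random noise for both the outcome and the candidate predictors (the counts are illustrative, not from any real dataset):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_subjects, n_variables = 25, 40

# One outcome and 40 candidate "predictors" - every one of them pure noise.
outcome = rng.normal(size=n_subjects)
predictors = rng.normal(size=(n_variables, n_subjects))

significant = 0
for x in predictors:
    _, p = stats.pearsonr(x, outcome)
    significant += p < 0.05

print(f"{significant} of {n_variables} pure-noise variables 'correlate' at p<.05")
# Roughly 2 of 40 on average. Trawl enough variables after the fact and a
# "correlation" will almost always turn up, which is why the comparison of
# interest has to be fixed before the data are collected.
```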
Again you don't seem to have read what I wrote very carefully.
Correlational aspects are only going to be used to inform the design parameters of the next round(s). That is consistent with standard scientific method. What is being
tested is the causal hypothesis.
Point 1 asked for your measure for the outcome, which you explicitly stated you would not provide and said you would change for different trials. This is not acceptable for a medical trial.
I said I would provide it before each trial in question. This is perfectly acceptable.
In any event, thanks for the material; I will be using it next time I teach research methods. I'm pretty sure any undergrad can figure out the same issues that some of us have raised against it...
Glad you enjoy reading.
However, I should point out that everything I have written is my copyright and I explicitly do not grant you any rights to use it in any manner whatsoever.
It's time to abandon arguing about the protocol and try to discover why you feel it is so important to test the power of intercessory prayer on disease outcome, especially when numerous published studies have shown its ineffectiveness. You claim you're agnostic and a skeptic, but what is your motivation for doing this study? A true skeptic would take the already overwhelming run of studies failing to reject the null, conclude that intercessory prayer has no effect on disease outcome, and move on to new matters. You've given us the how; now give us the why...
I'm not interested in discussing my motivation beyond what I have already stated: curiosity as a true (weak) agnostic. I decline to get dragged into an argument about theology, philosophy, and the like.
Am I the only one bothered by that last sentence? It seems rather confrontational (and smacks of distrust) for what is being presented as a plea to "check my methodology." When doing research, one should accept the fact that there are many ways to be wrong - probably more than there are ways to be right. Setting, a priori, the conditions under which you can be judged to be wrong is just not OK. You need to be open to errors being discovered, and willing to correct those errors. Through that statement, and his approach to those trying to help, saizai has indicated that he is not open to corrections or suggestions.
I'm quite open to correcting real errors.
I'm not open to "correcting" things that aren't really errors, or that are merely whims. I am only interested in your input insofar as it ensures that my methodology is tight. No score equation I can possibly choose, within the parameters I gave, would be a methodological flaw - and therefore I make it explicit that I can choose whatever I want.
This is for the simple reason of ensuring that everything in the application is totally explicit so there is no arguing later about what the terms are.
How about:
- prayer only works if a specific deity is addressed (FSM maybe?)
- prayer only works if done by a priest
- prayer only works if done on Sunday
- prayer only works if done by 100 or more people
Tracked correlatively. If that turns out to be the case, this info will be used to filter later rounds' participants.
- prayer only works if no one tracks the results
Inherent flaw in the design and indeed in all possible designs I can think of. Acceptable.
- prayer only works if you donate heavily to a church
- prayer only works if spoken in Latin
- prayer only works if done while standing on your head
Not tracked. If so, oh well, my miss.
This is where actually believing that it works a certain way comes in handy. You can then narrow down what you think actually matters and find out if you're right. With no belief, there are a great many factors you need to control for to make it a really worthwhile test. The "likely" result of "no effect" (given that the only proposed improvement over existing studies is sample size) will at least have some meaning if you hold some belief, in that it will encourage you to re-examine that belief.
You cannot, in principle, explicitly control for all possible factors. It's simply impossible by definition. That's what randomization is for.
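A minimal numerical sketch of what randomization buys you, assuming a hypothetical unmeasured binary factor present in 30% of the pool (all numbers made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
n_sims = 10_000

for n_per_group in (25, 100, 400):
    imbalances = []
    for _ in range(n_sims):
        # Hypothetical unmeasured factor: 30% of the pool happens to have it.
        factor = rng.random(2 * n_per_group) < 0.30
        rng.shuffle(factor)  # random assignment to the two groups
        control, active = factor[:n_per_group], factor[n_per_group:]
        imbalances.append(abs(control.mean() - active.mean()))
    print(f"n = {n_per_group:3d} per group: "
          f"typical imbalance = {np.mean(imbalances):.3f}")

# The factor is balanced on average at any size, but the run-to-run
# imbalance shrinks roughly as 1/sqrt(n) - which is the crux of the
# disagreement over whether 25 per group is enough.
```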
So I suppose if someone were to present an existing study that used 100 or more people, you would identify an error in its methodology that your study will avoid somehow, or you will agree that even the sample size of that study was insufficient, and increase your definition of "too small of a sample size" to include the new study?
I haven't seen this hypothetical study, therefore I cannot comment.
One further improvement I have thought of:
I'll set an arbitrary score equation for the first round - essentially a random guess. This allows JREF to participate in the first round as well as the second and third.
If the first round is positive, then we go directly to the third as the 'final test'; if not, then we go to the second round as the new 'preliminary test' with a score equation based on the actual data gathered in the first round.
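In code form, that amounts to roughly the following sketch; run_round, is_positive, and score_from_data are placeholder names for pieces the actual protocol would have to define, not anything already specified:

```python
def run_protocol(run_round, is_positive, score_from_data):
    # run_round(score_eq, label): runs one trial round and returns its data
    # is_positive(data): whether the round met its pre-set success criterion
    # score_from_data(data): builds a score equation from earlier data

    arbitrary_score_eq = "initial guess"  # fixed before any data is collected
    round1 = run_round(arbitrary_score_eq, "round 1: preliminary")

    if is_positive(round1):
        # Positive first round: go directly to the final test.
        return run_round(arbitrary_score_eq, "round 3: final test")

    # Otherwise round 2 becomes the new preliminary, with a score equation
    # derived from round-1 data and fixed before round 2 begins.
    new_score_eq = score_from_data(round1)
    return run_round(new_score_eq, "round 2: new preliminary")
```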