Sorry BAC, the first part of my answer ran a little long, but it is relevant to why a large number of samples and sample sets is needed to determine association.
David, the coin toss samples are completely independent events drawn from a process with the exact same probability of producing a head every single time. Do you think that the likelihoods of quasar/cluster arrangements and redshifts are completely independent of one another?
Now you are using a priori arguments to justify your a posteriori argument.
The null assumption is that the objects are randomly distributed; in other words, that there is no pattern beforehand.
And remember that random does not mean evenly distributed. Take a 1,000 x 1,000 matrix of cells and randomly place dots in it, using the pseudo-random generator of your choice, six ten-sided dice, or whatever means you like for randomly placing dots in the 2D matrix.
Do three separate runs, clearing the matrix between each: ten dots, one hundred dots, and one thousand dots.
In the ten-dot run the chance of any particular cell receiving a dot is very low, 10/1,000,000 = 0.00001; in the hundred-dot run it is still low, 0.0001; and in the thousand-dot run it is still low, 0.001.
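For concreteness, here is a minimal sketch of that experiment. The grid size and dot counts come straight from the description above; the function name and the use of Python's random module are just my own choices:

```python
import random

WIDTH, HEIGHT = 1000, 1000
CELLS = WIDTH * HEIGHT  # 1,000,000 possible positions

def random_placement(n_dots):
    """Place n_dots on distinct, uniformly random cells of the grid."""
    cells = random.sample(range(CELLS), n_dots)
    return [(c % WIDTH, c // WIDTH) for c in cells]  # back to (x, y)

for n in (10, 100, 1000):
    dots = random_placement(n)
    print(f"{n} dots placed; P(any given cell is occupied) = {n / CELLS}")
```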
Yet here is the thing: patterns can arise from a totally random process. Even in the ten-dot run you could get dots sitting right next to each other.
Say a dot lands at coordinate position (X, Y). In the ten-dot run, what is the chance that the cell right next to it, say (X+1, Y), also receives a dot? The chance does not change: it is still 0.00001. It does not go up or down because there is already a dot at (X, Y). Each of the eight cells adjacent to the placed dot has exactly the same 0.00001 chance of receiving a dot, regardless of any prior placements. Now consider every possible configuration of ten dots in the matrix: the number of distinct configurations is enormous, roughly 2.8 x 10^53.
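A quick way to see where that number comes from: it is the number of ways to choose 10 occupied cells out of the grid's 1,000,000 cells. A one-liner (Python here, purely for illustration) confirms the figure:

```python
import math

# Number of ways to choose 10 occupied cells out of the grid's
# 1,000,000 cells (order of placement ignored).
print(f"{math.comb(1_000_000, 10):.3e}")  # about 2.756e+53
```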
But that does not mean that a particular configuration is more or less likely than any other configuration.
A specific placement of all ten dots in a line has exactly the same chance of occurring as any of the other possible configurations. That is what it means to be random.
So there is an equal probability across all the configurations: a pattern with the dots in a line is no less likely than a pattern with the dots dispersed. They are equally likely.
So when you have a hundred dots and a thousand dots, you can get all sorts of patterns and associations, but they are still arising from a totally random process.
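To see this concretely, the quick simulation below estimates how often a purely random placement produces at least one pair of touching dots. It is built on the same assumptions as the sketch above; treating the eight surrounding cells as "next to" is my own choice:

```python
import random

WIDTH = HEIGHT = 1000

def has_adjacent_pair(dots):
    """True if any two dots sit on touching cells (8-neighbour sense)."""
    occupied = set(dots)
    for (x, y) in occupied:
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if (dx, dy) != (0, 0) and (x + dx, y + dy) in occupied:
                    return True
    return False

for n in (100, 1000):
    trials = 500
    # Cell collisions in the set comprehension are rare enough to ignore.
    hits = sum(
        has_adjacent_pair({(random.randrange(WIDTH), random.randrange(HEIGHT))
                           for _ in range(n)})
        for _ in range(trials))
    print(f"{n} dots: at least one adjacent pair in {hits}/{trials} placements")
```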
What you would have to do is study large numbers of configurations to determine whether the distribution is random or weighted in some way.
Say we have two algorithms for determining dot placement.
1. Random: there is no weighting or bias to the distribution.
2. Weighted: there is a small chance that a cell near an existing dot will receive the next dot. Specifically, for each dot placed after the first, there is a 10% chance it is a biased dot. If it is, an originating dot is chosen at random from all existing dots, and the biased dot is placed within a three-cell radius of that dot, with both the radial distance and the originating dot determined randomly. Otherwise the dot is placed uniformly at random.
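Here is one way the two algorithms might be coded up. The 10% bias probability and the three-cell radius come from the description above; everything else (the function names, reading "three square radius" as a square box, allowing coincident dots) is my own assumption:

```python
import random

WIDTH = HEIGHT = 1000

def algorithm_random(n_dots, rng):
    """Algorithm 1: every dot lands on a uniformly random cell."""
    return [(rng.randrange(WIDTH), rng.randrange(HEIGHT))
            for _ in range(n_dots)]

def algorithm_weighted(n_dots, rng, bias=0.10, radius=3):
    """Algorithm 2: after the first dot, each new dot has a `bias`
    chance of being placed within `radius` cells of a randomly chosen
    existing (originating) dot; otherwise it is placed uniformly at
    random. Coincident dots are allowed for simplicity."""
    dots = [(rng.randrange(WIDTH), rng.randrange(HEIGHT))]
    while len(dots) < n_dots:
        if rng.random() < bias:
            ox, oy = rng.choice(dots)                 # originating dot
            x = ox + rng.randint(-radius, radius)
            y = oy + rng.randint(-radius, radius)
            if 0 <= x < WIDTH and 0 <= y < HEIGHT:    # retry if off-grid
                dots.append((x, y))
        else:
            dots.append((rng.randrange(WIDTH), rng.randrange(HEIGHT)))
    return dots
```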
Now say you are given one configuration from each of the two algorithms. Are you going to be able to say which one came from the random placement and which from the weighted placement? Not likely: you do not have enough samples to tell.
Distinguishing the random placement from the weighted one requires multiple samples from each algorithm. With a sample size of one from each (Sn = 1), it is impossible to tell them apart, because you cannot determine a distribution pattern for either algorithm.
It is only with a much larger Sn, such as 100, 1,000, or 10,000, that you could detect the difference between the two algorithms with any accuracy.
This is counterintuitive, I understand that, but I am discussing real-world things here. With Sn = 1 it would be very hard to tell whether the algorithms differ at all; it is only as Sn gets large that you could determine that algorithm number two has a weighted distribution.
Both will exhibit patterns; it is only by comparing a large Sn that a determination can be made with any accuracy.
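As a sketch of how that comparison could actually be run, the snippet below (reusing the algorithm_random and algorithm_weighted functions from the earlier sketch, with mean nearest-neighbour distance as one plausible choice of statistic) generates Sn samples from each algorithm and compares the averages:

```python
import math
import random

def mean_nearest_neighbour(dots):
    """Average distance from each dot to its closest companion.
    O(n^2), which is fine for the 100-dot samples used here."""
    total = 0.0
    for i, (x1, y1) in enumerate(dots):
        total += min(math.hypot(x1 - x2, y1 - y2)
                     for j, (x2, y2) in enumerate(dots) if j != i)
    return total / len(dots)

rng = random.Random(42)  # fixed seed so the run is repeatable
for sn in (1, 100, 1000):
    r = [mean_nearest_neighbour(algorithm_random(100, rng)) for _ in range(sn)]
    w = [mean_nearest_neighbour(algorithm_weighted(100, rng)) for _ in range(sn)]
    print(f"Sn={sn:5d}  random: {sum(r)/sn:6.2f}  weighted: {sum(w)/sn:6.2f}")
```

With Sn = 1 the two numbers are often indistinguishable; as Sn grows, the weighted algorithm's average settles visibly below the random one's.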
So this is very comparable to determining whether QSO placement is random or not: a visible pattern is equally possible in a single configuration, so a limited Sn, say 25, is not going to give you enough data to say that the distribution is weighted.
It is only by comparing a large Sn and looking at the patterns across configurations that the weighting becomes visible. With a large enough Sn, for example, the average distance between dots will be noticeably different between the two algorithms. The difference simply is not apparent until Sn is high enough.