Arp objects, QSOs, Statistics

Just so BAC feels better, Arp and his proponents are not the only astronomers to use statistics in a coarse manner. I was looking at the seminal "Bullet Cluster" paper by Clowe et al.: http://arxiv.org/abs/astro-ph/0608407

I found it interesting to read the following:

Clowe said:
To explain the measured surface mass density, such filaments would have to be several Megaparsecs long, very narrow, and oriented exactly along the line of sight. The probability of such an orientation for two such filaments in the field is ∼ 10^-6. Further, because the two cluster components are moving at a relative transverse velocity of 4700 km/s compared to the typical peculiar velocities in the CMB frame of a few hundred km/s, the filaments could coincide so exactly with each of the BCGs only by chance. This is an additional factor of ∼ 10^-5 reduction in probability.

Isn't this a similar statistical reasoning method to Arp's? They certainly don't expound on how they arrive at the probabilities they quote.
 

I don't see anything wrong with the reasoning. They are comparing two theories - one in which only a very special set of configurations could reproduce the observation they made, and one in which nearly all configurations will reproduce it - and concluding that the data therefore favors the second. Estimating the probability in this case sounds pretty straightforward.

The difference with BAC's argument in this thread (I haven't read Arp) is that there is nothing at all unusual about the QSO redshifts given the 'mainstream' model - he is calculating "P" incorrectly.
 
Here are seven sets of QSO redshift data. Each set is all QSOs within a circle (on the sky) of 60' radius.

At least one set is QSOs 'predominantly along' the minor axis of a bright, low redshift spiral galaxy; the set is centred on the nucleus of that galaxy.

At least one set is QSOs within 20° of a (random) direction; the set is centred on a random point on the sky (away from the ZoA, LMC, SMC, etc).

At least one set is 'mock' QSOs within 20° of a (random) direction; this set (or these sets, if there is more than one) was created by randomly assigning a number of 'QSOs' (between the max and min of all other sets), randomly placing these 'QSOs' within 60' of 'the centre', and randomly assigning each 'QSO' a 'redshift'.

If there is more than one set of QSOs within 60' of a bright, low redshift spiral galaxy, those extra sets may comprise QSOs predominantly along some direction other than the minor axis; however, in no case do any quasars appear in more than one set.

I am providing this data in the blindest possible way, to provide as pure a test of 'the BAC approach' as possible (given such a small dataset). Of course, it is entirely possible that BAC may be able to correctly guess the nature of all seven sets, purely by chance (anyone want to calculate the probability of picking the 'right answers' purely by chance?), so this is only a toy test.

It is also possible that BeAChooser may not return to this thread.

The questions are the same as before: what are the 'probabilities', according to 'the BAC approach', of each of these sets with respect to 'Karlsson peaks' and 'Amaik peaks'?

Set ONE: 0.0547, 0.58, 0.720, 1.084, 1.12463, 1.41, 1.903, 2.05812, 2.10, 2.244, 2.6068, 2.98, 2.99, 3.0347, 3.14.

Set TWO: 0.467142, 0.520394, 1.08367, 1.21187, 2.58, 2.90609.

Set THREE: 0.150221, 0.352114, 0.369104, 0.64384, 0.84003, 0.86495, 0.947653, 1.09562, 1.20564, 1.56234, 1.67964, 1.97046, 2.04535, 2.05185, 2.24764, 2.99602.

Set FOUR: 0.217353, 0.252693, 0.342904, 0.362706, 0.537829, 1.05378, 1.1014, 1.16468, 1.45072, 1.54642, 1.56929.

Set FIVE: 0.365631, 0.7015, 0.746949, 0.937404, 0.963945, 1.03822, 1.1356, 1.2532, 1.37193, 1.73417, 1.8172, 1.86347, 2.1246, 2.3712.

Set SIX: 0.267251, 0.725625, 0.934802, 1.08559, 1.19983, 1.29071, 1.45628, 1.69317, 1.78238, 2.03665, 2.10012.

Set SEVEN: 0.071838, 0.198392, 0.487895, 0.718565, 0.962793, 1.10984, 1.2344, 1.37592, 1.39097, 1.45664, 1.47775, 1.67076, 1.78836, 1.81664, 1.85583, 1.93462, 2.13762, 2.50297, 4.43636.
 
DRD,

Kudos for running this to ground.

Questions, though: why have you introduced the AMAIK peaks?

And, am I correct in understanding that these were culled from actual QSO data?
 
If you give me a day or so, I'll take a stab at these data sets that DRD presents.

Thanks, DRD, this is fun!
 
DRD,

Kudos for running this to ground.

Questions, though: why have you introduced the AMAIK peaks?
.
To see if there's anything special about 'Karlsson peaks'; if BAC is correct, there should be a screamingly obvious difference between the 'Karlsson peak probabilities' and the 'Amaik peak probabilities'.

Of course, we have far too few data to do a decent test, but as an exercise in trying to understand 'the BAC approach', it might prove interesting ...
.
And, am I correct in understanding that these were culled from actual QSO data?
.
All the data is real, in the sense that these are the reported redshifts of 'QSOs' ... except for the 'mock' (random) set(s) - they are courtesy of a random number generator.
 
Are the AMAIK peaks determined randomly?

Do they have as their root some form of string-theory cosmology, or something?
 
Kudos for running this to ground.

He hasn't "run this to ground", Wangler. DRD is just playing his usual coy games. Notice that he could have just told us the name of the galaxies about which his various sets of quasars are found. He could have linked us to whatever's the source of his so-called "amaik" peaks. He could have said something about the calculation I did for the previous data he introduced and asked that I look at. But then that wouldn't be coy. Or maybe he's just trying to make the thread as long as possible. In any case, I'll not play his game. He can cite his sources for whatever he introduces or I will simply assume he's coyly making things up and ignore him.

In the meantime, here's a look at the results of the case David found: UGC 8584.

Using Arp's first 10 transformed z: 1.44, .62, 1.99, 1.65, 1.40, .58, .57, 2.59, 2.01, .61 with Karlsson z = 0.06, 0.3, 0.6, 0.96, 1.41, 1.96, 2.64 and my formula:

P = (14/3)^10 * 0.03 * 0.02 * 0.03 * 0.25 * 0.01 * 0.02 * 0.03 * 0.05 * 0.07 * 0.01 = 4.6 x 10^-9.
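
For readers following the arithmetic, here is a minimal Python sketch of how a product like the one above can be evaluated. It assumes each factor is (2 x 7 / 3) times the distance from a redshift to its nearest Karlsson value, which is inferred from the numbers shown rather than stated in the post; the names `bac_style_p` and `nearest_peak_distance` are made up for illustration. The ten distances are copied from the line above (a couple of them differ slightly from what the rounded z values would give).

```python
from math import prod

# Karlsson values as listed in the post above.
KARLSSON_PEAKS = [0.06, 0.3, 0.6, 0.96, 1.41, 1.96, 2.64]


def nearest_peak_distance(z, peaks=KARLSSON_PEAKS):
    """Distance from z to the closest peak (assumed meaning of each factor)."""
    return min(abs(z - p) for p in peaks)


def bac_style_p(distances, n_peaks=7, z_range=3.0):
    """Product of (2 * n_peaks / z_range) * d over all distances d.

    With 7 peaks and a range of 3 the prefactor is 14/3 per quasar,
    which reproduces the (14/3)^10 term in the post.
    """
    weight = 2 * n_peaks / z_range
    return prod(weight * d for d in distances)


# Distances exactly as quoted in the post for the first 10 quasars.
quoted_distances = [0.03, 0.02, 0.03, 0.25, 0.01, 0.02, 0.03, 0.05, 0.07, 0.01]
p = bac_style_p(quoted_distances)
print(f"{p:.1e}")
```

Run as written, this prints 4.6e-09, matching the figure quoted above.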

Average expected P if uniform distribution using my equation and three random uniform samples for z:

Case 1: .387, 2.918, 2.406, 1.299, 1.014, 2.283, 2.385, 1.325, 0.529, 1.561

Pu1 = (14/3)^10 * 0.087 * 0.278 * 0.234 * .11 * .054 * .127 * .025 * .085 * 0.071 * .151 = 4.8 x 10^-4

Case 2: 2.913, 2.714, 0.557, 1.393, 1.775, 2.077, 2.003, 0.575, 0.048, 2.321

Pu2 = (14/3)^10 * 0.273 * 0.074 * .008 * .185 * .117 * .043 * .025 * .012 * .32 = 7.1 x 10^-5

Case 3: 2.699, 1.688, 0.976, 1.780, 0.36, 2.581, 0.126, 2.032, 2.213, 0.119

Pu3 = (14/3)^10 * .059 * .272 * .016 * .18 * .06 * .059 * .066 * .072 * .197 * .059 = 4.4 x 10^-5

So it looks like an average of about 10^-5 is expected if the distribution is uniform. Yet this observation is 10^-9. Interesting.
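
Three samples are too few to pin down what a uniform distribution 'expects', since a product of ten small factors can vary by an order of magnitude or more from draw to draw. A rough Monte Carlo sketch in Python, under the same assumed reading of the formula (nearest-peak distance times 2 x 7 / 3 per quasar, z uniform on 0 to 3), shows how one might look at the spread instead. The function names, the seed, and the trial count are arbitrary choices, not anything from the thread.

```python
import random
from math import log10, prod

KARLSSON_PEAKS = [0.06, 0.3, 0.6, 0.96, 1.41, 1.96, 2.64]
Z_MAX = 3.0


def p_from_redshifts(zs, peaks=KARLSSON_PEAKS, z_max=Z_MAX):
    """Assumed form of the statistic: product of (2 * n_peaks / z_max) * d_i."""
    weight = 2 * len(peaks) / z_max
    return prod(weight * min(abs(z - p) for p in peaks) for z in zs)


def simulate_uniform_p(n_quasars, n_trials, seed=0):
    """Sorted P values for n_trials samples of uniform redshifts."""
    rng = random.Random(seed)
    return sorted(
        p_from_redshifts([rng.uniform(0.0, Z_MAX) for _ in range(n_quasars)])
        for _ in range(n_trials)
    )


sims = simulate_uniform_p(n_quasars=10, n_trials=20000)
for q in (0.05, 0.50, 0.95):
    value = sims[int(q * (len(sims) - 1))]
    # Guard against the (very unlikely) case of a redshift landing exactly on a peak.
    shown = log10(value) if value > 0 else float("-inf")
    print(f"quantile {q:.2f}: log10(P) = {shown:.2f}")
```

Comparing an observed P against these quantiles, rather than against a handful of hand-picked samples, gives a fairer idea of how unusual it is under the uniform assumption.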

How about with the whole set of 23 quasar data points from Arp:

P = (14/3)^23 * 0.03 * 0.02 * 0.03 * 0.25 * 0.01 * 0.02 * 0.03 * 0.05 * 0.07 * 0.01 * 0.06 * 0.03 * 0.21 * 0.1 * 0.01 * 0.23 * 0.05 * 0.26 * 0.1 * 0.17 * 0.13 * 0.13 * 0.04 = 3 x 10^-14

Average based on a uniform distribution should be (for 2 random samples of 23 z):

Case 4: 1.753, 0.986, 1.560, 1.577, 0.892, 0.182, 2.342, 2.341, 1.226, 1.484, 2.168, 1.114, 0.149, 2.793, 1.038, 0.709, 2.507, 2.964, 0.908, 0.00055, 0.439, 2.700, 0.724, 1.067

P = (14/3)^23 * .207 * 0.026 * 0.08 * 0.063 * 0.068 * 0.118 * 0.298 * 0.299 * .185 * .156 * .192 * .151 * .153 * 0.078 * .109 * .137 * .324 * .052 * .05945 * .139 * .06 * .124 * .107 = 7.8 x 10^-7

Case 5: 2.091, 1.826, 0.187, 0.727, 0.834, 0.922, 2.137, 2.276, 1.756, 1.087, 2.384, 0.707, 2.408, 2.040, 1.889, 2.356, 0.881, 1.526, 1.447, 1.259, 1.142, .419, 0.0591

P = (14/3)^23 * .131 * 0.134 * 0.113 * 0.127 * 0.126 * 0.038 * 0.177 * 0.316 * .204 * .127 * .256 * .107 * .232 * 0.08 * .071 * .284 * .079 * .126 * .047 * .141 * .182 * .119 * .0009 = 5.6 x 10^-8

So it looks like an average of about 10^-8 is expected. Yet this observation is 10^-14. Again interesting.

Still think my formula can't detect something unusual in the data?
 
He hasn't "run this to ground", Wangler. DRD is just playing his usual coy games. Notice that he could have just told us the name of the galaxies about which his various sets of quasars are found. He could have linked us to whatever's the source of his so-called "amaik" peaks. He could have said something about the calculation I did for the previous data he introduced and asked that I look at. But then that wouldn't be coy. Or maybe he's just trying to make the thread as long as possible. In any case, I'll not play his game. He can cite his sources for whatever he introduces or I will simply assume he's coyly making things up and ignore him.

BeAChooser, have you ever heard of a double-blind test? It is a common method in statistical analysis of data where neither the person setting the test nor the person doing the test knows the source of the data. It is not until after the test is done that the source of the data is revealed.

DRD is actually doing a single-blind test since he knows the source of the data.

DRD does state the source of the data: catalogues for the real data and a random number generator for the random data.

I do hope you apply your test to the data. It will remove any question that your test will work regardless of the source of the data. Doing this will remove the chance of someone looking at this thread and assuming that your test only works on data that you select.
 
Are the AMAIK peaks determined randomly?
.
No, the values are very non-random.
.
Do they have as their root some form of string-theory cosmology, or something?
.
No.

Per Reality Check's comment, perhaps it may have been better to develop two sets of extra 'peaks' - the Amaik ones (which are most certainly not random), and another (chosen randomly).

In any case, as I said earlier, the dataset is far too small to 'prove' anything ... however, it should have considerable value in terms of helping regulars (such as me) get a handle on 'the BAC approach', if for no other reason than we can begin to work out how it works, in a blackbox sense (since BeAChooser has decided, for reasons apparently only known to himself, to not crunch the seven+ sets of data, nor answer the earlier set of questions; it was BAC's non-answer to those which triggered my fall-back method of trying to understand his method).

Oh, one other thought: in addition to 'blinding', the inputs I provided avoid (to some extent) the difficulties of a posteriori analyses, if only because no one (other than me, at this stage) knows how many (if any) of the 'origins' of the sets fall on galaxies that are the explicit target of earlier Arp (et al.) papers, or not.
 
... snip ...

He could have said something about the calculation I did for the previous data he introduced and asked that I look at.

... snip ...
.
Ah, but my dear BAC, I did say "something about the calculation I did for the previous data" - I said I didn't understand the approach, and I asked a series of questions to try to clarify how it works! :p

Questions which you chose (as AChooser, you Be) to not answer ... along with quite a number of other questions, it seems.

PS: I still don't understand 'the BAC approach'; would you be kind enough to explain it, by going through the data I have provided, and show me (and all other readers) how it works?

Don't be coy now ...
 
In the Arp (et al.?) paper on NGC 5985, there are 5 'quasars' within 60' of the galaxy nucleus, an extrapolated areal density of 1.6 per square degree.

NED lists 20 'QSO' objects within this radius (for NGC 5985), an extrapolated areal density of 6.4 per square degree.

For NGC 3516, Chu et al. have 6 (or 5, depends on definitions) within 24.2', an extrapolated areal density of 13.7 per square degree; NED lists 9 'QSO' objects within 60', an extrapolated areal density of 2.9 per square degree.

L-C&G's paper contains details of the numbers of quasars within 3° of the 71 galaxies in their paper; even over such a big chunk of sky, the calculated areal density varies by more than a factor of two.

For the work I did recently (the seven sets of data), the extrapolated areal density ranges from 14.0 to 42.0 per square degree.

It would seem that the density of quasars, on the sky, varies quite a lot, from place to place in the sky!
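
The 'extrapolated areal density' figures can be checked with a few lines of Python. This sketch assumes the density is simply the count divided by the area of a flat circle of the stated radius (reasonable for radii of a degree or less); the helper name `areal_density` is hypothetical. With that assumption the 60' figures come out as quoted, while 6 objects within 24.2' gives about 11.7 per square degree rather than 13.7, so that figure may rest on a slightly different radius or count.

```python
from math import pi


def areal_density(count, radius_arcmin):
    """Objects per square degree inside a circle of the given radius.

    Assumes a flat-sky circle: area = pi * r^2 with r converted to degrees.
    """
    radius_deg = radius_arcmin / 60.0
    return count / (pi * radius_deg ** 2)


cases = [
    ("NGC 5985, Arp paper", 5, 60.0),
    ("NGC 5985, NED", 20, 60.0),
    ("NGC 3516, Chu et al.", 6, 24.2),
    ("NGC 3516, NED", 9, 60.0),
]
for label, count, radius in cases:
    print(f"{label}: {areal_density(count, radius):.1f} per square degree")
```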
 
He hasn't "run this to ground", Wangler. DRD is just playing his usual coy games. Notice that he could have just told us the name of the galaxies about which his various sets of quasars are found. He could have linked us to whatever's the source of his so-called "amaik" peaks. He could have said something about the calculation I did for the previous data he introduced and asked that I look at. But then that wouldn't be coy. Or maybe he's just trying to make the thread as long as possible. In any case, I'll not play his game. He can cite his sources for whatever he introduces or I will simply assume he's coyly making things up and ignore him.
So when someone meets your challenge, you fold up? What kind of approach is that? Why? You can do better than that, can't you? Oh, I see ...
In the meantime, here's a look at the results of the case David found: UGC 8584.


Hmm... yeah, random chance, that's right, from a biased sample no doubt.
From Arp's paper:
In a computer analysis of the 2dF redshift survey, groups of quasars that obeyed the Karlsson values with respect to neighboring galaxies were catalogued in Fulton & Arp (2006) (Paper I). UGC 8584 turned out to have 9 of its nearest quasars fall especially close to the standard Karlsson values

So what does that mean, that he picked the one with the best values from the catalogue?

Isn't that cherry picking?
 
In the Arp (et al.?) paper on NGC 5985, there are 5 'quasars' within 60' of the galaxy nucleus, an extrapolated areal density of 1.6 per square degree.

NED lists 20 'QSO' objects within this radius (for NGC 5985), an extrapolated areal density of 6.4 per square degree.

For NGC 3516, Chu et al. have 6 (or 5, depends on definitions) within 24.2', an extrapolated areal density of 13.7 per square degree; NED lists 9 'QSO' objects within 60', an extrapolated areal density of 2.9 per square degree.

L-C&G's paper contains details of the numbers of quasars within 3° of the 71 galaxies in their paper; even over such a big chunk of sky, the calculated areal density varies by more than a factor of two.

For the work I did recently (the seven sets of data), the extrapolated areal density ranges from 14.0 to 42.0 per square degree.

It would seem that the density of quasars, on the sky, varies quite a lot, from place to place in the sky!


I am shocked to find half of our students are below average!
 
In the meantime, here's a look at the results of the case David found: UGC 8584.

Using Arp's first 10 transformed z: 1.44, .62, 1.99, 1.65, 1.40, .58, .57, 2.59, 2.01, .61 with Karlsson z = 0.06, 0.3, 0.6, 0.96, 1.41, 1.96, 2.64 and my formula:

P = (14/3)^10 * 0.03 * 0.02 * 0.03 * 0.25 * 0.01 * 0.02 * 0.03 * 0.05 * 0.07 * 0.01 = 4.6 x 10^-9.

Let's take a look at that. Those 10 measured z values are far from uniformly distributed from 0 to 3 - that much is clear at a glance. If we group them in 6 bins - from 0 to .5, .5 to 1, etc. - we see that there are none below .5, 4 between .5 and 1, 2 each in the next two bins, and 1 each in the last two.

So here's a hypothesis: 40% of all QSO redshifts are distributed randomly between .5 and 1, 20% between 1 and 1.5, 20% between 1.5 and 2, and 10% each between 2 and 2.5 and between 2.5 and 3.

OK, let's compare my hypothesis to BAC's using his methods. I'll generate a random set of 10 values with the distribution I hypothesized and compute BAC's P for it.

For brevity, I'll just tell you that the answer was 5 x 10^-10, compared to the real data, which was only 5 x 10^-9! (See the spoiler for the numbers.)

Conclusion: the z values of QSOs in this data set are probably not uniformly distributed in z from 0 to 3. A mild assumption on the distribution produced a smaller P value than the data, indicating there is no correlation with the Karlsson peaks (beyond the fact that the data is more concentrated in the region where the Karlsson peaks are closer together). Such a distribution could arise for many reasons having to do with the way these data are collected.

In[19]:= Table[.5 + .5 Random[], {n, 1, 4}]

Out[19]= {0.688276, 0.710341, 0.903617, 0.726767}

In[20]:= (.6 - .688276) (.6 - .710341) (.96 - .903617) (.726767 - .6)

Out[20]= 0.00006962

In[21]:= Table[1 + .5 Random[], {n, 1, 2}]

Out[21]= {1.30142, 1.00788}

In[22]:= (1.41 - 1.3014225295311879`) (.96 - 1.0078755031892541`)

Out[22]= -0.0051982

In[24]:= Table[1.5 + .5 Random[], {n, 1, 2}]

Out[24]= {1.79155, 1.84984}

In[29]:= (1.96 - 1.7915467797055513`)*((1.8498436787838968` - 1.96))

Out[29]= -0.0185562

In[30]:= 2 + .5 Random[]

Out[30]= 2.37338

In[31]:= 2.5 + .5 Random[]

Out[31]= 2.95512

In[32]:= (2.3733784330159726` - 2.64) (2.955116134453641` - 2.64)

Out[32]= -0.0840168

In[33]:= 0.08401675754997921` * -0.018556187044642278` * \
0.00519820103371076` * 0.00006961998960798804`

Out[33]= -5.64211*10^-10
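
The same experiment can be sketched in Python. Like the session above, it draws exactly 4, 2, 2, 1 and 1 redshifts from the five occupied bins and multiplies the distances to the nearest Karlsson value. Unlike the session, it also applies the 14/3 prefactor per quasar, on the assumption that this is the normalization BAC's formula uses; the function names and the seed are arbitrary, and a single draw like this is only an illustration, not a result.

```python
import random
from math import prod

KARLSSON_PEAKS = [0.06, 0.3, 0.6, 0.96, 1.41, 1.96, 2.64]
# (lower bin edge, number of redshifts drawn in that 0.5-wide bin),
# following the 4 / 2 / 2 / 1 / 1 split described in the post.
BIN_COUNTS = [(0.5, 4), (1.0, 2), (1.5, 2), (2.0, 1), (2.5, 1)]


def draw_binned_sample(rng, bin_counts=BIN_COUNTS, width=0.5):
    """Draw the stated number of uniform redshifts inside each bin."""
    return [lo + width * rng.random() for lo, k in bin_counts for _ in range(k)]


def p_statistic(zs, peaks=KARLSSON_PEAKS, z_max=3.0):
    """Assumed form: product of (2 * n_peaks / z_max) * nearest-peak distance."""
    weight = 2 * len(peaks) / z_max
    return prod(weight * min(abs(z - pk) for pk in peaks) for z in zs)


rng = random.Random(1)
sample = draw_binned_sample(rng)
print(sorted(round(z, 3) for z in sample))
print(f"P = {p_statistic(sample):.2e}")
```

Repeating the draw many times with different seeds would show how much P scatters under this binned hypothesis.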
 
First, here are the redshifts of the 33 'QSOs' NED lists as being within 30' of a bright, low redshift spiral galaxy (z = 0.00411):

0.159193, 0.162, 0.336373, 0.421634, 0.607041, 0.7359, 0.7361, 0.777445, 0.7954, 0.8798, 1.04486, 1.0928, 1.2489, 1.2649, 1.371, 1.4574, 1.4791, 1.5363, 1.5962, 1.61306, 1.6569, 1.73016, 1.7623, 1.8089, 1.841, 1.9192, 1.9593, 1.9728, 2.13761, 2.19801, 2.2645, 2.49, and 2.57.

Next, here are the quasars that lie 'predominantly along' a preferred direction (one such is the 'minor axis', the other is a different 'preferred direction'; I'm not saying which is which):

a) 0.162, 0.336373, 1.371, 1.4791, 1.61306, 2.13761, 2.19801

b) 0.607041, 0.8798, 1.04486, 1.73016, 1.841, 1.9192, 2.57.

Alright, my results for the three data sets shown above, compared to the Karlsson values are:

Group, n, P
All QSO, 33, 5.70217E-15
Group a, 7, 0.001699212
Group b, 7, 0.011957018

So, I am not sure what this means.

My definition of the results:

The probability of 33 arbitrary QSO z's being distributed as close to the Karlsson values as the 33 QSOs listed by DRD is 5.7 x 10^-15.

The probability of 7 arbitrary QSO z's being distributed as close to the Karlsson values as the Group a QSOs is 1.7 x 10^-3.

The probability of 7 arbitrary QSO z's being distributed as close to the Karlsson values as the Group b QSOs is 1.2 x 10^-2.

I think that the large n case is meaningless; of course the probability of finding 33 QSO redshifts with this particular distribution relative to the Karlsson values is going to be astronomically small.

For the Group a and Group b cases, I guess that Group a is more significant, statistically, than Group b.

So, if we agree with the Arp and BAC hypothesis, the Group a QSOs are probably the group aligned along the minor axis.
 
Would you mind showing your working for the Group a and Group b calculations please?

Which of BAC's formulae and inputs did you use?
 
Let's take a look at that.

Disregard my post above - I forgot to normalize the probabilities properly, so my number is not correct. I don't have time to fix it now.

Anyway, I'm still waiting for BAC to answer my question: how low does P have to be before we can exclude the hypothesis that the QSOs are uniformly distributed from 0 to 3 with 95% confidence? Then, how low does P have to be before we can exclude the 'mainstream' model (which is NOT that QSOs in that data set are uniformly distributed)? Finally, what model are we comparing the mainstream model to?
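
The first of those questions can at least be approached numerically. The Python sketch below estimates, by simulation, the value of P below which only 5% of uniform samples fall, and the fraction of uniform samples at or below a given observed P. It assumes the statistic is the product of (2 x 7 / 3) times the nearest-peak distance for each quasar, as inferred from BAC's arithmetic; `critical_value` and `tail_fraction` are hypothetical helper names. It says nothing about the second and third questions, which need an actual model of the 'mainstream' redshift distribution.

```python
import random
from math import prod

KARLSSON_PEAKS = [0.06, 0.3, 0.6, 0.96, 1.41, 1.96, 2.64]
Z_MAX = 3.0


def p_statistic(zs, peaks=KARLSSON_PEAKS, z_max=Z_MAX):
    """Assumed form: product of (2 * n_peaks / z_max) * nearest-peak distance."""
    weight = 2 * len(peaks) / z_max
    return prod(weight * min(abs(z - pk) for pk in peaks) for z in zs)


def null_distribution(n_quasars, n_trials=20000, seed=0):
    """Sorted P values for samples of redshifts uniform on [0, Z_MAX]."""
    rng = random.Random(seed)
    return sorted(
        p_statistic([rng.uniform(0.0, Z_MAX) for _ in range(n_quasars)])
        for _ in range(n_trials)
    )


def critical_value(sorted_sims, alpha=0.05):
    """Empirical alpha-quantile: roughly alpha of null samples fall at or below it."""
    return sorted_sims[int(alpha * (len(sorted_sims) - 1))]


def tail_fraction(sorted_sims, p_observed):
    """Fraction of null samples with P at or below the observed value."""
    return sum(1 for v in sorted_sims if v <= p_observed) / len(sorted_sims)


sims = null_distribution(n_quasars=10)
threshold = critical_value(sims)
print(f"5% threshold for n = 10: {threshold:.2e}")
print(f"fraction of uniform samples at or below 4.6e-9: {tail_fraction(sims, 4.6e-9):.4f}")
```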
 
