Arp objects, QSOs, Statistics

Dancing David said:
Hiya BAC, I have two sets of double numbers from 1-100
I'm not going to waste my time looking at this because it has no relation to the actual methodology I used, David. If you want to engage me in further debate, then do the relatively simple Monte Carlo analysis I suggested to you. If you aren't sure what you have to do, then see the post I made to DRD a few posts back. You made specific claims about what my model would show and the only way to prove your claims is to do this analysis. If you aren't willing to do that, we can only suspect that you know you were wrong or you aren't capable of doing that simple Monte Carlo analysis.
I must say you certainly are full of surprises BeAChooser! :p

As I understand Dancing David's questions, and the one in the post you are quoting in particular, they go to the heart of your approach: how do you distinguish a causal relationship from a chance one, solely on the basis of the kind of calculation you have presented?

If, however, you are not claiming that your approach can make such a distinction, then I (for one) have (once again) misunderstood what you have written (big, big time).

If you are making such a claim, then DD's test is a good one, and I'd have expected you'd have welcomed the chance to show everyone just how powerful your approach is.

In another thread, ben_m replied to something someone else wrote as follows:
This does NOT happen:
Q) "Bob, I understand that MOND fits rotation curves, but there are similarly slow accelerations in disk oscillations; why do those look so Newtonian? Doesn't that disprove your theory?"
A) "That's a great question, Professor Zwicky. In fact, you could say that the answer to that question would allow the theory to be conclusively disproven. The answer is presumably in my peer reviewed paper, but I'm not doing your work for you."
Q) "Oh, OK."

This happens---well, assuming that the speaker isn't full of baloney.
Q) "Bob, I understand that MOND fits rotation curves, but there are similarly slow accelerations in disk oscillations; why do those look so Newtonian?"
A) "That's a great question, Professor Zwicky. We looked into that, and the data actually disagree with the old version of the MOND, that's why we're presenting the non-Lorentz-invariant version. For this version, the fit is actually really good."
Q) "I still find it hard to believe; do you have a slide of that?"
A) "It's in the paper; let me pull up a PDF and show you."

Your smug refusal to estimate the forces on a star does not suggest that we're going to give you, and the IEEE peer review system, the benefit of the doubt. Your refusal to estimate the forces tells us that you're full of it. You don't know the forces; if you did know the forces, you would know that they disprove your theory. I looked for a force estimate in Peratt's papers---Peratt didn't estimate them, he explicitly assumed they were large. If you know that the forces are large, it's time to stand up and tell us how you know. Show your work, with units. Get on it.
In this thread, you have certainly shown your working (good), but it seems that no one (other than yourself) actually understands how that working leads to the conclusions you so firmly, and repeatedly, state.

And this is despite you being kind enough to spend time answering lots and lots of questions about your approach.

BeAChooser, one way to learn about something is to try it in a different place, or in a different way.

I suggested doing this with a mock, 2D universe, populated by only AGs and Qs.

Dancing David has suggested doing this with some dummy numbers, a blind test that is a pale shadow of the kind of thing that is supposed to happen when you develop a hypothesis for testing.

And there may be other examples.

In both cases you have either ignored the suggestion (my mock universe) or refused to cooperate; why?

Let's test this in a different way: would any reader of this thread (other than BAC) who thinks they understand the approach he has used, and how the probabilities he calculated lead to the conclusions he has stated, please say so? If you do understand it, would you be prepared to help BAC out, by answering some of the questions that are, as yet, unanswered?
 
BAC, please don't split threads; keep comments in this thread and not the Hoyle-Narlikar thread.

Here are your posts and responses
http://www.internationalskeptics.com/forums/showpost.php?p=3640215&postcount=44
Say Wrangler ... ask Sol to come comment on my calculations regarding the probability of seeing certain quasar/galaxy associations.

http://www.internationalskeptics.com/forums/showpost.php?p=3640238&postcount=45

http://www.internationalskeptics.com/forums/showpost.php?p=3640335&postcount=46
No, I'd like him (and you, for that matter) to comment on this thread: http://www.internationalskeptics.com/forums/showthread.php?t=107779&page=9 , starting with posts #328 and #329. And maybe this time, Sol will actually understand what I'm calculating. His last comment was out in left field showing little understanding and, frankly, struck me as handwaving.

http://www.internationalskeptics.com/forums/showpost.php?p=3641560&postcount=49

http://www.internationalskeptics.com/forums/showpost.php?p=3641608&postcount=50
Oh, and by the way - I was in the mood for some wholesome family entertainment, so I took a look at BAC's post:


Originally Posted by BeAChooser
To find the probability of a set of r specific values picked randomly from a distribution of n different values, we actually need to ratio the number of ways one can pick those r values from the distribution by the number of ways one can pick any r values from the distribution. Right?

For example, if we have a distribution with 5 possible values (call them a,b,c,d,e) and we want the probability of seeing c and d show up in a random draw of 2 values from that pool of 5 possibilities, we first need to find the number of ways we can draw c and d. Well that turns out to be r!, so the answer is 2 in that case.

Next, we need to divide by the number of ways one can draw ANY 2 values from the 5 possibilities. Note that drawing that value does not eliminate it from the pool. The formula to use here is n^r. So there are 5^2 = 25 ways of drawing 2 values from a pool containing 5 different values.

So the probability of seeing c and d in a single observation in the above example is 2/25 = 0.08 = 8 percent.

So the formula I should have used in my calculation for the probability of seeing r specific values of z picked randomly from a distribution of n different values of z is

P = r!/n^r.
As anyone with even basic mathematical competence (that bothers to read this trash carefully enough) can see immediately, this formula is totally wrong. r! grows much faster with r than n^r, so for large r, these "probabilities" become larger than 1. For a simple example, suppose there were only 2 possible values, heads and tails, and we wanted to know the odds of getting HHTT. According to BAC, that's 4!/2^4=1.5. Oops!*

Of course it's much worse than that - as I keep pointing out, BAC is using false a posteriori reasoning from the very beginning. His way of applying his wrong formula is a good example - he asks what the odds are of getting quasar redshifts close to some particular set of values, and finds that it's small. But he doesn't ask for the odds of finding some other set of redshifts which are close to some other set of those values - and yet, had that other set been the data, he would have claimed it to be equally unlikely and hence equally as significant.

*This formula is correct for something - namely, the odds of drawing r distinct values out of n - but that is not what BAC needs, nor is it how he uses it. There is no reason why two quasars can't have the same redshift, and in fact there is a case where they do in this same post.

http://www.internationalskeptics.com/forums/showpost.php?p=3641660&postcount=51
Ah but you missed BAC's post where he actually did take a different set of inputs, and calculated the 'probability' (per his method); the result was a (small) number higher than the result from 'his' method ... from this he confidently concluded that 'his' (a posteriori) configuration is enough to disprove a key aspect of LCDM cosmology (or at least provided sufficient grounds for justifying his claims that all but a tiny handful of astronomers, astrophysicists, etc are essentially blind dolts).

http://www.internationalskeptics.com/forums/showpost.php?p=3641689&postcount=52

Originally Posted by DeiRenDopa
Ah but you missed BAC's post where he actually did take a different set of inputs, and calculated the 'probability' (per his method); the result was a (small) number higher than the result from 'his' method ...
His formula is just wrong, so there's not much point in reading any other of his posts. I've had enough entertainment for now.

In any case, the correct way to do this would be to take every possible data set, calculate the "significance" of each (using whatever formula you choose - BAC's wrong one, for example), and then ask what fraction of those possible data sets are as significant or more than the actual data. That fraction is the true significance of the data with respect to that formula. (Of course there are much simpler approximations to that procedure one almost always can use in cases where the formula is correct and actually means something.)

In this case that would have to be done for the full sample of all quasars, not just those near a few cherry-picked galaxies. And even then, one must have a theory to compare to, and one must be extremely careful with systematic biases in the data sets.
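(To make sol's prescription concrete, here is a minimal sketch in Python; the uniform null model, the choice of score, and all names here are assumptions for illustration, not anything sol or BAC posted.)

```python
import random
import math

KARLSSON = [0.06, 0.3, 0.6, 0.96, 1.41, 1.96, 2.64]  # peaks as quoted later in the thread
Z_MAX = 3.0

def score(zs):
    # One possible score: the product of each redshift's distance to its
    # nearest Karlsson peak (smaller = "more significant").
    return math.prod(min(abs(z - k) for k in KARLSSON) for z in zs)

def significance(observed, trials=100_000):
    # Fraction of null data sets scoring as extreme or more than the data.
    # Null model (an assumption here): redshifts drawn uniformly on (0, Z_MAX).
    s_obs = score(observed)
    r = len(observed)
    hits = sum(
        score([random.uniform(0.0, Z_MAX) for _ in range(r)]) <= s_obs
        for _ in range(trials)
    )
    return hits / trials

# e.g. the five quasar redshifts quoted later in this thread:
print(significance([0.69, 0.81, 1.90, 1.97, 2.13]))
```

Whichever score one picks, the Monte Carlo fraction, not the score itself, is the significance.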


Please do not crosspost to other threads; I believe that is a distraction.

Thank you.
 
BAC, I would like to clarify two separate issues:

As I understand it there are two separate hypotheses, and if I am in error then please explain the hypotheses to me:

a. Arp's hypothesis that there are QSOs which are associated with galaxies, in that they are in optical proximity, and that the redshifts for the QSOs are not related to the cosmological redshift of standard theory, because of the optical proximity or conjunction of galactic features and QSOs.
a. i. - This association comes from the distance of the QSO to the galaxy or an apparent overlap with galactic structures.

b. There is a further association of some QSOs to the 'minor axes' of some galaxies
b. i. - This association is determined through a +/- degree measure of 7.5 degrees and 182.5 degrees, so that there is a 15 degree area on either side, diametrically opposed, where these QSOs occur.
b. ii. - There is a further association with the Karlsson peaks, where QSOs have redshifts that cluster around certain values.


Regards

b. i. - is there an orientation to the minor axes? I admit I am confused: is it always oriented to the flat plane of the galaxy (at a specific angle?), or can the two opposed 15 degree areas appear in any orientation?

b. ii. - is there a site you recommend on Karlsson peaks? (For the slower in the crowd, namely me)
 
JREF forum member Beth has posted a concise statement concerning hypothesis testing, how to go about it, what the null hypothesis should be, and so on.

Here is the core part:
The way statistical hypothesis testing works is that you define the null as the hypothesis you assume true and wish to reject in favor of the alternative hypothesis. In addition, the mathematics behind it require that the null hypothesis contain the equality while the alternative hypothesis is a strict inequality. (It has to do with computing probabilities of closed sets versus open sets.)

You decide on a test statistic for your data and compute the probability of observing a test statistic of that size under the assumption that the null is true. Then you can reject the null if the probability of the observed values of the test statistic (p) is smaller than alpha. You can accept the alternative as being true with a confidence of 1 - p.

The reason that we do it this way (which seems backwards to many people) is that it allows us to specify the level of confidence we want to have before accepting the alternative. When we do, we can be very confident we are making a correct decision. However, when we fail to reject the null, the probability that we are making the correct decision can be much much lower.
BeAChooser, you might like to consider inviting her to review your material. This might be particularly good as:

* she seems to have not read this thread

* she seems to be familiar with the details of hypothesis testing

* what little she seems to have posted on astronomy or cosmology, at least this year, has been in the form of very good questions.
 
BAC please stop cross posting to the other thread; your comments and the responses belong here. Not that you will respond to them in a reasonable manner. Please stop the cross posting. :)


http://www.internationalskeptics.com/forums/showpost.php?p=3641689&postcount=52
Originally Posted by DeiRenDopa
Ah but you missed BAC's post where he actually did take a different set of inputs, and calculated the 'probability' (per his method); the result was a (small) number higher than the result from 'his' method ...
His formula is just wrong, so there's not much point in reading any other of his posts. I've had enough entertainment for now.

In any case, the correct way to do this would be to take every possible data set, calculate the "significance" of each (using whatever formula you choose - BAC's wrong one, for example), and then ask what fraction of those possible data sets are as significant or more than the actual data. That fraction is the true significance of the data with respect to that formula. (Of course there are much simpler approximations to that procedure one almost always can use in cases where the formula is correct and actually means something.)

In this case that would have to be done for the full sample of all quasars, not just those near a few cherry-picked galaxies. And even then, one must have a theory to compare to, and one must be extremely careful with systematic biases in the data sets.


http://www.internationalskeptics.com/forums/showpost.php?p=3642140&postcount=54
Originally Posted by sol invictus
BeAChooser wrote: So the formula I should have used in my calculation for the probability of seeing r specific values of z picked randomly from a distribution of n different values of z is P = r!/n^r.

As anyone with even basic mathematical competence (that bothers to read this trash carefully enough) can see immediately, this formula is totally wrong.
ROTFLOL! Wrong, sol. That is, in fact, the correct formula for the calculation I described.


Originally Posted by sol invictus
r! grows much faster with r than n^r, so for large r, these "probabilities" become larger than 1.
Except you overlooked one thing ... r<=n . So this formula is ALWAYS less than 1 for any combination of r and n that fit that constraint ... a constraint which clearly applies to the calculation I set forth on the other thread. So all you've done here, sol, is demonstrate that you didn't even bother to read the methodology I described and you don't have a clue what you are talking about.

I would ask that if you have further comments about my calculation ... like this one, you post them on the thread where the calculation is presented and discussed so foolish criticism like this can be seen in the proper context. And not disrupt this thread. And by the way, your comment regarding my "use" of a posteriori reasoning is equally bogus as has been explained on that other thread as well. No need to do it here.

http://www.internationalskeptics.com/forums/showpost.php?p=3642214&postcount=55
Originally Posted by BeAChooser
ROTFLOL! Wrong, sol. That is, in fact, the correct formula for the calculation I described.
Nope. That's not how you used it.


Quote:
Except you overlooked one thing ... r<=n .
Nope. Go re-read your own post. In your second example, you put two redshifts into the same "bin" (i.e. two are closest to the same reference value). The moment you do that, the formula is wrong. (Moreover, in Arp's "model" there is no reason why r cannot be greater than n, or why two quasars cannot correspond to the same peak. So your formula is wrong both for the "model" you wanted to study and given the way you applied it.)

Here it is:
Quote:
In this case, observed z = 0.69, 0.81, 1.90, 1.97, 2.13 according to http://articles.adsabs.harvard.edu//...00006.000.html . With Karlsson z = 0.06, 0.3, 0.6, 0.96, 1.41, 1.96, 2.64, the spacings to the nearest Karlsson values are +0.09, -0.15, -0.06, +0.01 and +0.17.
Both z=1.90 and z=1.97 are assigned to the Karlsson z=1.96 slot. So you're simply picking r values out of n, not r distinct values out of n.

Of course as I said, this is also wrong for a much more basic reason - the probability you get from this is NOT the significance. To get that you must calculate the probability of every possible data set you'd consider equally or more significant, and normalize with respect to that.

--------------------------------------------------------------------------------

http://www.internationalskeptics.com/forums/showpost.php?p=3642438&postcount=56
Originally Posted by sol invictus
Quote:
Except you overlooked one thing ... r<=n .

Nope.
Wrong. r is ALWAYS <= n in the methodology used in my calculations so that formula will NEVER be > 1. You are simply wrong, sol.


Originally Posted by sol invictus
In your second example, you put two redshifts into the same "bin" (i.e. two are closest to the same reference value). The moment you do that, the formula is wrong.
Wrong again, sol. The formula n^r specifically applies in problems where the samples are returned to the population from which they are taken before drawing the next one. That part of the formula allows multiple draws of the same number. And the numerator has nothing to do with what the values of the samples are, only the total number of samples drawn. And the methodology (which you clearly didn't bother to understand) is such that r is ALWAYS <= n ... so you are simply wrong, and not wise enough to know when to quit before embarrassing yourself further.


So there you are BAC, the posts are here for you to ignore in this thread as well.


So how do your statistics determine a causal placement from a random one?

Will you answer that and kindly answer the questions I had above about Arp's hypothesis and Karlsson peaks?
 
... snip ...

DeiRenDopa said:
In fact, I'd challenge any reader of this thread to repeat the kind of 'probability estimates' you have come up with, using a different set of input data ....
Wow! Insulting our readers. Come on DRD ... you don't think our readers are capable of

- coming up with a random set of r values of z between 0.00 and 3.00 to replace the ones I used as observations in one of the calculations I did?

- finding the difference between those z values and the nearest Karlsson values to each?

- doubling those differences to get a set of increments?

- dividing 3.0 by those increments to get a set of n?

- finding the probability with this formula: P = r! * (1/n_1) * (1/n_2) * ... * (1/n_r)?

- Comparing that probability to the one I got for that case using the observed z?

You should have more faith in their and your abilities. :D
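(The steps BAC lists are mechanical enough to script; here is a minimal sketch, using the Karlsson values quoted elsewhere in the thread and treating the formula exactly as stated, without endorsing it.)

```python
import random
import math

KARLSSON = [0.06, 0.3, 0.6, 0.96, 1.41, 1.96, 2.64]

def bac_probability(zs, z_range=3.0):
    # BAC's listed steps: deltaz_i = distance to nearest Karlsson value,
    # increment_i = 2 * deltaz_i, n_i = z_range / increment_i,
    # P = r! * (1/n_1) * (1/n_2) * ... * (1/n_r).
    ns = [z_range / (2 * min(abs(z - k) for k in KARLSSON)) for z in zs]
    return math.factorial(len(zs)) * math.prod(1.0 / n for n in ns)

# A random replacement data set, per the challenge:
random_zs = [random.uniform(0.0, 3.0) for _ in range(5)]
print(random_zs, bac_probability(random_zs))
```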

It seems I was insufficiently clear.

Here's an example of what I mean by different inputs:

Which galaxy/xies do I choose to run the calculations on? How do I know if I've chosen the 'right' galaxy (or kind of galaxy)? Can I choose any old low redshift galaxy? or must it be a galaxy found in one of Arp's papers?

Having chosen a galaxy, how far out do I look for quasars? 30'? 60'? 90'? more? How do I decide how far out I should look, in general?

Having decided how far out to go, how do I decide which objects within the circle to choose? Only those which are BSOs on Palomar Schmidt plates AND are x-ray sources? how about an AGN which is in a type 2 Seyfert? or a type 2 quasar? a BL Lac object?

Having selected my set of 'quasars', how do I decide which ones are 'on' the minor axis of my chosen galaxy? Do I say the cutoff is 45° (the criterion L-C&G used)? or is it 25° (what we might infer from reading some of the Arp et al. papers)? Or can I arbitrarily select a criterion?

Having selected my 'minor axis quasars', how do I calculate which Karlsson peak each is 'near'? Do I use one of Karlsson's papers for those peaks? or one of Arp's? In calculating 'distance' from a peak, what do I do if the redshift is 'near' the midpoint between two peaks? How do I incorporate stated observational uncertainty in the published redshifts?

So you see, BAC, there are a lot of things one must decide before even starting any calculation ... and I freely confess to not knowing how to make any of the decisions in the chain I briefly outlined above.

And to reinforce my point: if you, BAC, are the only person who can say whether the inputs have been selected correctly or not (before a calculation begins), how can your approach be said to be objective?
 
Trial 2


Set 1
(50,61)(18,85)(51,32)(77,41)(4,32)(54,41)(50,98)(75,7)(86,40)(26,52)

Set 2
(77,35)(11,78)(93,81)(66,81)(44,62)(63,18)(17,43)(19,62)(4,65)(39,25)
 
OK - responding here:

Wrong again, sol. The formula n^r specifically applies in problems where the samples are returned to the population from which they are taken before drawing the next one. That part of the formula allows multiple draws of the same number. And the numerator has nothing to do with what the values of the samples are, only the total number of samples drawn. And the methodology (which you clearly didn't bother to understand) is such that r is ALWAYS <= n ... so you are simply wrong, and not wise enough to know when to quit before embarrassing yourself further.

Nope.

Let's do an example again. Suppose there are only two possible values, H and T (heads and tails, say). Now, you say we're allowed to draw the same value twice (like flipping a coin). So, what are the odds of getting HH?

According to you, it's 2!/2^2 = 1/2. But that is obviously wrong - there are four possible outcomes, HH, HT, TH, and TT, so HH has probability 1/4, not 1/2.

As I already explained to you twice, the reason it's wrong is that formula applies only when the values are distinct (the odds of getting one H and one T - so HT or TH - is indeed 1/2). Get it?

This is very clear and very basic. You've made a rather stupid and fundamental mistake at the very first step in your calculation. Sadly for you, every other step is wrong too, because the whole approach is wrong for a much more important reason. But this is enough by itself to disqualify anything you say about probabilities or statistics from any consideration, as it illustrates a severe lack of comprehension of the basics of the subject.
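(sol's coin example can be checked by brute-force enumeration; a minimal sketch:)

```python
from itertools import product

outcomes = list(product("HT", repeat=2))  # HH, HT, TH, TT

p_hh = sum(o == ("H", "H") for o in outcomes) / len(outcomes)
p_one_each = sum(sorted(o) == ["H", "T"] for o in outcomes) / len(outcomes)

print(p_hh)        # 0.25, not 2!/2**2 = 0.5
print(p_one_each)  # 0.5, which is what r!/n**r actually gives (r distinct values)
```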
 
I am having trouble understanding the period thing applied to redshifts of QSOs, like in this paper:
http://www.datasync.com/~rsf1/qso-rsa.htm
They divide the QSOs into slices based upon magnitude, but there is no real explanation of why, or of what might affect magnitude.

They don't even say what sort of factors might affect magnitude or why they chose the slice width that they did.

I don't get it, what does the magnitude have to do with it? It may not even be a reliable indicator of distance.
 

It seems I was insufficiently clear.

Here's an example of what I mean by different inputs:

Which galaxy/xies do I choose to run the calculations on? How do I know if I've chosen the 'right' galaxy (or kind of galaxy)? Can I choose any old low redshift galaxy? or must it be a galaxy found in one of Arp's papers?

Having chosen a galaxy, how far out do I look for quasars? 30'? 60'? 90'? more? How do I decide how far out I should look, in general?

Having decided how far out to go, how do I decide which objects within the circle to choose? Only those which are BSOs on Palomar Schmidt plates AND are x-ray sources? how about an AGN which is in a type 2 Seyfert? or a type 2 quasar? a BL Lac object?

Having selected my set of 'quasars', how do I decide which ones are 'on' the minor axis of my chosen galaxy? Do I say the cutoff is 45° (the criterion L-C&G used)? or is it 25° (what we might infer from reading some of the Arp et al. papers)? Or can I arbitrarily select a criterion?

Having selected my 'minor axis quasars', how do I calculate which Karlsson peak each is 'near'? Do I use one of Karlsson's papers for those peaks? or one of Arp's? In calculating 'distance' from a peak, what do I do if the redshift is 'near' the midpoint between two peaks? How do I incorporate stated observational uncertainty in the published redshifts?

So you see, BAC, there are a lot of things one must decide before even starting any calculation ... and I freely confess to not knowing how to make any of the decisions in the chain I briefly outlined above.

And to reinforce my point: if you, BAC, are the only person who can say whether the inputs have been selected correctly or not (before a calculation begins), how can your approach be said to be objective?


Enquiring minds want to know!
 
Sol, I finally see what you are saying in your example. Ok. Obviously, my formula isn't quite right. But since I don't see anyone stepping up to the plate to offer a substitute and I'd still like to see how this calculation turns out, let me try again. You know what they say ... third time's a charm. Hopefully, you will approve of this revision. :)

First, the probability of any given quasar z value being a certain deltaz from a specific z value in the range z=0 to 3, assuming the quasar z comes from a uniform distribution of z, is 1/(3/(2*deltaz)) = (2*deltaz)/3. Do you agree?

Now in the problem I'm trying to investigate there are 7 specific values (called Karlsson values) in the range 0 to 3. In which case, it seems to me that the probability of any given quasar z value being within a certain deltaz from the nearest specific (Karlsson) z value is (2*deltaz*7)/3. Would you agree?

And since quasar z values are assumed independent in the mainstream theory (i.e., they have no connection to one another), the probability of seeing r quasars around a galaxy all with z values within a given deltaz of a Karlsson value is ((2*deltaz*7)/3)^r. Agree?

But since the quasar probabilities are independent and the observed quasar z have different deltaz to the nearest Karlsson value, I propose that the total probability can be found by separating each of the data points, finding the probability like I did above given its specific deltaz and then multiplying their individual probabilities together. In other words, the probability of seeing r quasars around a galaxy all with z values within the distance they are to their nearest Karlsson value is

P = (2*deltaz_1*7)/3 * (2*deltaz_2*7)/3 * ... * (2*deltaz_r*7)/3 = (14/3)^r * (deltaz_1 * deltaz_2 * ... * deltaz_r)

Would you agree with that new formula, sol?
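(For readers who want to check the arithmetic that follows, a minimal sketch that evaluates the proposed formula exactly as written; it reproduces the numbers below, which says nothing about the formula's statistical validity, which sol disputes shortly.)

```python
import math

def p_new(deltas, n_peaks=7, z_range=3.0):
    # The formula as proposed: P = ((2 * n_peaks) / z_range)**r * prod(deltaz_i)
    return ((2 * n_peaks) / z_range) ** len(deltas) * math.prod(deltas)

print(p_new([0.03, 0.09, 0.03, 0.01, 0.14]))  # the NGC 3516 case below, ~0.000251
```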

Now let's try using this new formula with each of the cases.

For NGC 3516, there are r=5 quasars at deltaz of 0.03, 0.09, 0.03, 0.01 and 0.14. So the new P_new = 2213 * 0.03 * 0.09 * 0.03 * 0.01 * 0.14 = 0.000251, where 2213 = (14/3)^5. Previously I calculated P = 0.000002 without weighting factors. So it's quite a bit larger than before. Let's ignore weighting this first go around since I'm not sure my weighting method has all that much merit so it probably needs to be revised too. If I now multiply 0.000251 by the number of galaxies with 5 quasars that I estimated in my methodology,

P_NEWzTotal(NGC 3516) = 0.000251 * 5433 = 1.4

Meaning that if we were to look at what I estimate to be all possible quasar/galaxy associations with 5 quasars, we'd expect to find 1-2 cases like this one. Not definitive either way although I suspect we haven't looked at anywhere near all possible quasar/galaxy associations with 5 quasars.

For NGC 5985, there are 7 quasars with z between 0 and 3 and they have deltaz of 0.09, 0.15, 0.06, 0.008, 0.008, 0.164 and 0.172.
So P_new = 48200 * 0.09 * 0.15 * 0.06 * 0.008 * 0.008 * 0.164 * 0.172 = 0.0000705.

So P_NEWzTotal(NGC 5985) = 0.0000705 * 1724 = 0.12

Again, not definitive but still small. I guess now the question of how many cases were examined before encountering each of these cases may need to be addressed in more detail. Anyone know? Also, the conclusions may be sensitive to what is assumed for the way quasars are distributed amongst low redshift galaxies. But based on this, so far, I'm still leaning toward the mainstream theory being faulty.

Continuing ...

For NGC 2639, there are 6 quasars with delta z of 0.005, 0.023, 0.037, 0.052, 0.106 and 0.01. So P_new = 10329 * 0.005 * 0.023 * 0.037 * 0.052 * 0.106 * 0.01 = 0.0000024

P_NEWzTotal(NGC 2639) = 0.0000024 * 3018 = 0.007.

Now that's a very small probability if one looked at all possible quasar/galaxy associations. So how many associations with 6 quasars this close have they found in the mainstream's database? Any one have an answer?

Finally, for NGC 1068 there are 12 quasars with deltaz of 0.039, 0.085, 0.132, 0.03, 0.049, 0.055, 0.084, 0.126, 0.104, 0.152, 0.142, 0.058

P_new = 106679598 * 0.039 * 0.085 * 0.132 * 0.03 * 0.049 * 0.055 * 0.084 * 0.126 * 0.104 * 0.152 * 0.142 * 0.058 = 0.0000052

P_NEWzTotal(NGC 1068) = 0.0000052 * 132 = 0.0007

Surely that's somewhat definitive?

And just for the heck of it, let's add one more case ... DRD's NGC 4030 which has 22 quasars with deltaz of 0.122, 0.122, 0.270, 0.09, 0.102, 0.382, 0.398, 0.334, 0.288, 0.252, 0.168, 0.178, 0.312, 0.528, 0.296, 0.22, 0.136, 0.03, 0.206, 0.266, 0.48, 0.144.

P_new = 5.226 x 10^14 * 0.122 * 0.122 * 0.27 * 0.09 * 0.102 * 0.382 * 0.398 * 0.334 * 0.288 * 0.252 * 0.168 * 0.178 * 0.312 * 0.528 * 0.296 * 0.22 * 0.136 * 0.03 * 0.206 * 0.266 * 0.48 * 0.144 = 0.35

P_NEWzTotal(NGC 4030) = 0.35 * ? How many galaxies have we observed with 22 quasars near it? :D
 
Sol, I finally see what you are saying in your example. Ok. Obviously, my formula isn't quite right.

So you admit the person you said was "simply wrong, and not wise enough to know when to quit before embarrassing yourself further" was in fact completely right. Who, exactly, is embarrassing themselves?

Again, not definitive but still small. I guess now the question of how many cases were examined before encountering each of these cases may need to be addressed in more detail. Anyone know?
I'm confused. You are trying to calculate the probability of observing some apparently low probability event after x number of trials and yet have no idea what x is?
 
And since quasar z values are assumed independent in the mainstream theory (i.e., they have no connection to one another), the probability of seeing r quasars around a galaxy all with z values within a given deltaz of a Karlsson value is ((2*deltaz*7)/3)^r. Agree?

That is correct. But you must first specify a delta z and then check the data. You cannot use the data to specify delta z in this naive way, or you are doing a posteriori statistics.

But since the quasar probabilities are independent and the observed quasar z have different deltaz to the nearest Karlsson value, I propose that the total probability can be found by separating each of the data points, finding the probability like I did above given its specific deltaz and then multiplying their individual probabilities together. In other words, the probability of seeing r quasars around a galaxy all with z values within the distance they are to their nearest Karlsson value is

P = (2*deltaz_1*7)/3 * (2*deltaz_2*7)/3 * ... * (2*deltaz_r*7)/3 = (14/3)^r * (deltaz_1 * deltaz_2 * ... * deltaz_r)

Would you agree with that new formula, sol?

No - that is completely wrong for the reason I keep telling you. Since you don't know the math, perhaps it's again best illustrated by example.

To simplify the numbers, suppose the z range is (0,1), and suppose there is only one Karlsson peak, at z=.5. Now generate a data set of 10 QSOs with z values drawn from a flat distribution on (0,1). According to you, I should take that data set and compute "P" = (2*deltaz_1) * (2*deltaz_2) * ... * (2*deltaz_10).

But this is obviously wrong, because in this situation 2*deltaz is always less than 1, and in fact will have arithmetic mean 1/2. So if each 2*deltaz were 1/2, we'd get P=.001. But actually the product (which is a geometric mean) will be dominated by the occasional smaller values, and its typical size is less than .00001 (if you don't believe me, generate some random numbers and check it yourself). So according to your formula the probability of a typical flat distribution is .00001, when it should be close to 1.
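(Here is one way to "generate some random numbers and check it yourself": a minimal sketch of sol's example with z uniform on (0,1) and a single peak at 0.5.)

```python
import math
import random
import statistics

def one_product(r=10):
    # z uniform on (0, 1), single peak at 0.5, factor = 2 * deltaz per point
    return math.prod(2 * abs(random.random() - 0.5) for _ in range(r))

products = [one_product() for _ in range(100_000)]
print(statistics.median(products))  # typically a few times 1e-5
print(0.5 ** 10)                    # ~0.00098, the "every factor average" value
```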

There is a correct way to do this kind of analysis. First of all, you cannot use a different value of deltaz for each point. You must pick one value for all of them. Secondly, you must be very careful if you use the data in any way to pick your delta z. Normally the theory you were comparing to would tell you how much to expect delta z to be, and you would use that.

But if the theoretical uncertainties are too large for that, one could do the following: take part of the data set and use it to find the delta z which maximizes the significance of that part. Now throw that part of the data away, and analyze the rest using the value of delta z you found in the first part.
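(A minimal sketch of that train/test idea; the score, the grid of candidate deltaz values, and the 50/50 split are all illustrative assumptions, not sol's prescription in detail.)

```python
import random

KARLSSON = [0.06, 0.3, 0.6, 0.96, 1.41, 1.96, 2.64]
Z_MAX = 3.0

def frac_within(zs, dz):
    # Fraction of redshifts within dz of their nearest Karlsson peak.
    return sum(min(abs(z - k) for k in KARLSSON) <= dz for z in zs) / len(zs)

def excess_over_null(zs, dz):
    # Observed fraction minus the uniform-null expectation,
    # approximated by the union bound 2 * dz * n_peaks / Z_MAX.
    return frac_within(zs, dz) - min(1.0, 2 * dz * len(KARLSSON) / Z_MAX)

def split_test(zs, grid=(0.01, 0.02, 0.05, 0.10, 0.15)):
    zs = list(zs)
    random.shuffle(zs)
    train, test = zs[: len(zs) // 2], zs[len(zs) // 2:]
    best_dz = max(grid, key=lambda dz: excess_over_null(train, dz))  # tune on half
    return best_dz, excess_over_null(test, best_dz)                  # score the rest
```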
 
... snip ...

How many galaxies have we observed with 22 quasars near it? :D

Waaaay back in post #98 of this thread I provided a link to a recent study of IR-selected AGNs in a ~9 deg^2 field in the constellation of Boötes. For anyone reading this post who is also interested in getting an idea of how astronomers go about researching AGNs today, this is an excellent paper ... it covers a wide range of observational issues that need to be addressed in extragalactic multiwavelength surveys, and it illustrates how hard researchers work to test their findings for consistency, using different techniques and inputs (and much, much more).

Relevant to BeAChooser's question is the following:

* survey area of 8.5 deg^2
* 839 unobscured IR-selected AGNs with z > 0.7 found
* 640 obscured IR-selected AGNs with z > 0.7 found
* there are very few objects with z > 3.0 among these 1479 AGNs
* estimated completeness of unobscured AGNs of 90% (but read the paper carefully for the full details)
* completeness and contamination of the obscured AGNs is difficult to estimate reliably, but likely to be ~85% and <30% respectively (this simple summary is really quite inadequate).

This implies that the areal density of IR-selected unobscured AGNs with z > 0.7 (and < 3.0) is ~100 per deg^2, and of IR-selected obscured AGNs (in the same redshift range) is ~75 per deg^2.

Relevant to BeAChooser's question, within 30' of any point on the sky (away from the ZoA, etc), the average number of IR-selected AGNs would be ~135, and near a large (on the sky) galaxy such as NGC 4030, the number may well be higher, due to weak lensing. Of course, NGC 4030 being large, AGNs close to the inner parts (line of sight) would likely suffer significant extinction, so the total would drop somewhat.

We can also look at BeAChooser's question another way: how many galaxies are there in the central 5.8 deg^2 field studied by Hickox et al. (i.e. leaving out those within 30' of the edge, and assuming the field studied was circular)?

I don't know, but the average number of IR-selected AGNs (0.7 < z < 3.0) in the 30' around each such galaxy would be ~135, which is, of course, way more than 22.

Astute readers will be asking themselves several questions, one of which is likely to be something like 'what is the definition of 'IR-selected'?' Of course, the paper spends quite some time defining this, but for our purposes we may summarise it as objects which have fluxes > 6.4, 8.8, 51, and 50 μJy in each of the four Spitzer IRAC bands (3.6, 4.5, 5.8, and 8 μm, respectively; i.e. clear detections in all four bands), AND z > 0.7.
 
So you admit the person you said was "simply wrong, and not wise enough to know when to quit before embarrassing yourself further" was in fact completely right.

Yes. Isn't it nice to finally meet someone around here who is able to admit when he's wrong? Unlike so many that one meets around here. ;)

Who, exactly, is embarrassing themselves?

The folks who show no interest whatsoever in calculating the probability of seeing certain observations, given the mainstream assumptions about quasar numbers, quasar distribution with respect to low-redshift galaxies and quasar redshift?

I've asked and asked for one of the mainstream proponents to tell us what they think is the probability of seeing the observations I've noted. And none of them will. The only answer I seem able to get from them is a probability of 1, because we've seen one. But that's not an answer to the question I posed and you know it. It's an evasion.

So let me ask you, since you've stepped in as their advocate, what is the probability of seeing any of the 5 observations I noted above given the mainstream's assumptions regarding the various parameters in such a calculation? Should we expect to see observations like those (i.e., as close to the Karlsson values as these ones seemingly are) on average every 10th galaxy? Every 10,000th? Every 1,000,000,000th? Should we have expected to see any of those observations if we'd looked at every single galaxy in the sky that has as many quasars around them as those do? Since you guys are just dismissing this question out of hand, you must think you already know the answer. So why don't you tell us that answer? :D

I'm confused. You are trying to calculate the probability of observing some apparently low probability event after x number of trials and yet have no idea what x is?

No, that's actually wrong. I have an estimate for what the maximum value of x could be if we accept what the mainstream has stated are the total number of observable quasars in the sky, make a reasonable (hopefully) assumption about how those quasars are numerically distributed with respect to low redshift galaxies, and assume that we were able to examine every possible quasar/galaxy association with that number of nearby quasars. And even using an x that large, the probabilities calculated above seem (with iteration #3 of the calculation methodology) to indicate it's rather unlikely we would have seen some of those observations. And since the actual number of quasar/galaxy associations that have been studied in detail is probably much less than the total possible such associations, the likelihood that we'd have seen some of these observations is even smaller than what I calculated.

So, again, what do you claim is the expected probability of any of the 5 observations I've noted? And if you answer 1.0, you will only embarrass yourself. I'd like to see your procedure, by the way. :D
 
Yes. Isn't it nice to finally meet someone around here who is able to admit when he's wrong? Unlike so many that one meets around here. ;)



The folks who show no interest whatsoever in calculating the probability of seeing certain observations, given the mainstream assumptions about quasar numbers, quasar distribution with respect to low-redshift galaxies and quasar redshift?

I've asked and asked for one of the mainstream proponents to tell us what they think is the probability of seeing the observations I've noted. And none of them will. The only answer I seem able to get from them is a probability of 1, because we've seen one. But that's not an answer to the question I posed and you know it. It's an evasion.
No, not really.

I have stated repeatedly that the best method to determine the likelihood of such an event is to measure it.

By taking samples around 'normative' areas, i.e. 'normative' galaxies and random points selected on the sky.

Then one obtains a 'representative' sample of the phenomena, and if the sample groups are 1000 you have sort of an idea of what the probability/occurrence is, if you have 10,000 in your sample groups you have a good idea what the probability/occurrence is, and if you have sample groups of 100,000 then you have a fairly clear idea of what the probability/occurrence might be.

That is what I have said all along.
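(The point about sample sizes is largely a matter of binomial sampling error; a minimal sketch, with a purely hypothetical 1% occurrence rate:)

```python
import math

def rate_and_error(hits, n):
    # Estimated occurrence rate and its approximate binomial standard error.
    p = hits / n
    return p, math.sqrt(p * (1 - p) / n)

# Hypothetical: the configuration shows up in 1% of sampled fields.
for n in (1_000, 10_000, 100_000):
    print(n, rate_and_error(round(0.01 * n), n))  # error shrinks like 1/sqrt(n)
```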
So let me ask you, since you've stepped in as their advocate, what is the probability of seeing any of the 5 observations I noted above given the mainstream's assumptions regarding the various parameters in such a calculation? Should we expect to see observations like those (i.e., as close to the Karlsson values as these ones seemingly are) on average every 10th galaxy? Every 10,000th? Every 1,000,000,000th? Should we have expected to see any of those observations if we'd looked at every single galaxy in the sky that has as many quasars around them as those do? Since you guys are just dismissing this question out of hand, you must think you already know the answer. So why don't you tell us that answer? :D
I understand why you might think that way. But again, in population demographics and sampling that is not usually the way it is approached.

Take ten hands of poker (five card no draw) dealt, returned, shuffled and dealt again.

The individual odds of a royal flush in hearts are (1/52)^5 = 2.63 x 10^-9.
Now does this mean that out of ten hands you should expect:

10 x 2.63 x 10^-9 = 2.63 x 10^-8
and therefore if the royal flush in hearts appears twice its actual probability of occurrence is

(2.63 x 10^-8)^2 and therefore if it did happen it means that the shuffle was not random?
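(For comparison, the standard calculation for a rare hand seen twice in ten deals uses the binomial distribution rather than squaring anything; a minimal sketch with DD's numbers:)

```python
from math import comb

p = 2.63e-9  # per-deal probability quoted above
n = 10       # deals

expected = n * p
p_two_or_more = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(2, n + 1))

print(expected)       # ~2.63e-8 royal flushes expected in ten deals
print(p_two_or_more)  # ~45 * p**2 ~ 3.1e-16, the chance of seeing it twice
```

Note this is the pre-specified-hand calculation; the a posteriori problem discussed throughout the thread, of noticing whichever rare thing happened to occur, is a separate issue.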
 
... snip ...

The folks who show no interest whatsoever in calculating the probability of seeing certain observations, given the mainstream assumptions about quasar numbers, quasar distribution with respect to low-redshift galaxies and quasar redshift?

I've asked and asked for one of the mainstream proponents to tell us what they think is the probability of seeing the observations I've noted. And none of them will. The only answer I seem able to get from them is a probability of 1, because we've seen one. But that's not an answer to the question I posed and you know it. It's an evasion.

So let me ask you, since you've stepped in as their advocate, what is the probability of seeing any of the 5 observations I noted above given the mainstream's assumptions regarding the various parameters in such a calculation? Should we expect to see observations like those (i.e., as close to the Karlsson values as these ones seemingly are) on average every 10th galaxy? Every 10,000th? Every 1,000,000,000th? Should we have expected to see any of those observations if we'd looked at every single galaxy in the sky that has as many quasars around them as those do? Since you guys are just dismissing this question out of hand, you must think you already know the answer. So why don't you tell us that answer? :D

Tubbythin said:
I'm confused. You are trying to calculate the probability of observing some apparently low probability event after x number of trials and yet have no idea what x is?

No, that's actually wrong. I have an estimate for what the maximum value of x could be if we accept what the mainstream has stated are the total number of observable quasars in the sky, make a reasonable (hopefully) assumption about how those quasars are numerically distributed with respect to low redshift galaxies, and assume that we were able to examine every possible quasar/galaxy association with that number of nearby quasars. And even using an x that large, the probabilities calculated above seem (with iteration #3 of the calculation methodology) to indicate it's rather unlikely we would have seen some of those observations. And since the actual number of quasar/galaxy associations that have been studied in detail is probably much less than the total possible such associations, the likelihood that we'd have seen some of these observations is even smaller than what I calculated.

So, again, what do you claim is the expected probability of any of the 5 observations I've noted? And if you answer 1.0, you will only embarrass yourself. I'd like to see your procedure, by the way. :D

Well, I am firmly in the Tubbythin camp, confused.

First, I thought this thread was about how you, BeAChooser, came up with the conclusions that you have presented, about the approach you used, the hypothesis/ses (including the null hypothesis) being tested, and the calculations involved.

Second, although you have stated, many times, that you are seeking to test "mainstream assumptions", "mainstream theory", "mainstream models", and so on, every time I try to get a straight answer on just what those are, and how your test/approach/hypothesis/etc actually does that (starting with derivation from the relevant theories and models), I fail. In particular, every calculation you have presented contains 'Karlsson peaks' and 'minor axes' and 'near an active galaxy' and so on, all of which are either directly or indirectly taken from non-mainstream ideas!

Third, every attempt to try to get you to specify your hypothesis/approach/etc via a concrete example, from someone other than you, has met with rebuff.

And so on.

Perhaps a direct answer to one of my latest posts (#386) would help?

Here it is again:
Here's an example of what I mean by different inputs:

Which galaxy/xies do I choose to run the calculations on? How do I know if I've chosen the 'right' galaxy (or kind of galaxy)? Can I choose any old low redshift galaxy? or must it be a galaxy found in one of Arp's papers?

Having chosen a galaxy, how far out do I look for quasars? 30'? 60'? 90'? more? How do I decide how far out I should look, in general?

Having decided how far out to go, how do I decide which objects within the circle to choose? Only those which are BSOs on Palomar Schmidt plates AND are x-ray sources? how about an AGN which is in a type 2 Seyfert? or a type 2 quasar? a BL Lac object?

Having selected my set of 'quasars', how do I decide which ones are 'on' the minor axis of my chosen galaxy? Do I say the cutoff is 45° (the criterion L-C&G used)? or is it 25° (what we might infer from reading some of the Arp et al. papers)? Or can I arbitrarily select a criterion?

Having selected my 'minor axis quasars', how do I calculate which Karlsson peak each is 'near'? Do I use one of Karlsson's papers for those peaks? or one of Arp's? In calculating 'distance' from a peak, what do I do if the redshift is 'near' the midpoint between two peaks? How do I incorporate stated observational uncertainty in the published redshifts?

So you see, BAC, there are a lot of things one must decide before even starting any calculation ... and I freely confess to not knowing how to make any of the decisions in the chain I briefly outlined above.

And to reinforce my point: if you, BAC, are the only person who can say whether the inputs have been selected correctly or not (before a calculation begins), how can your approach be said to be objective?
 
I am having trouble understanding the period thing applied to redshifts of QSOs, like in this paper:
http://www.datasync.com/~rsf1/qso-rsa.htm
They divide the QSOs into slices based upon magnitude, but there is no real explanation of why, or of what might affect magnitude.

They don't even say what sort of factors might affect magnitude or why they chose the slice width that they did.

I don't get it, what does the magnitude have to do with it? It may not even be a reliable indicator of distance.

The input data is a 1993 catalogue, and you'd have to read that catalogue very carefully to understand how it was compiled, how the various selection effects known at the time were addressed, and how it might be affected by selection effects that became known only after 1993.

I too cannot work out what conclusion Fritzius draws from the analysis, or even what conclusions would make sense ... is this related to BAC's approach in some way?
 
The original intent of the thread was to address the methodology of Arp and whether his method can tell a random placement from a causal one, especially given the nature of sampling error.

Just for clarification.
 
You cannot use the data to specify delta z in this naive way, or you are doing a posteriori statistics.

Sol, will you admit that the probability of seeing a given observation, given the assumptions behind the mainstream model with respect to quasar redshift, is the same regardless of when the data I'm asking that we study was gathered? That probability is implicit in the model itself, independent of the specific observation having been made or not. That probability is what we would EXPECT to see in that case given the model assumptions and nothing else. Right? Likewise, the total number of observable quasars that the mainstream seems to think are in the sky is (for all intents and purposes) independent of these specific observations too. Right? And so is the distribution of quasars with respect to low redshift galaxies. That is (for all intents and purposes) independent of when these observations were made or when I do these calculations. Right?

And will you admit that the answer to what the probability is of seeing a given observation, given the model assumptions, is of more than passing interest if the answers turn out to be very low probabilities? Our interest here is in checking whether the assumptions in the mainstream model are correct. Suppose the calculated probability of seeing each of these observations came out to be 10^-100, assuming we'd examined every possible observable quasar/galaxy association in the sky? Wouldn't that be telling us that the model is wrong if we'd then already seen 5 such observations? Sure, there is a very small probability that one could see those observations but wouldn't the better bet be that the model itself is defective in some important way? Of course it would.

Scientists (and engineers) build models all the time and then test those models against data they encounter to ensure the model works. This is no different. The argument about this being a meaningless or improper calculation because it uses a posteriori statistics is a smokescreen for ignoring legitimate indications of a possible problem with the model.

BeAChooser wrote:
P = (2*deltaz_1*7)/3 * (2*deltaz_2*7)/3 * ... * (2*deltaz_r*7)/3 = (14/3)^r * (deltaz_1 * deltaz_2 * ... * deltaz_r)
Would you agree with that new formula, sol?

No - that is completely wrong for the reason I keep telling you. Since you don't know the math, perhaps it's again best illustrated by example.

To simplify the numbers, suppose the z range is (0,1), and suppose there is only one Karlsson peak, at z=.5. Now generate a data set of 10 QSOs with z values drawn from a flat distribution on (0,1). According to you, I should take that data set and compute "P" = (2*deltaz_1) * (2*deltaz_2) * ... * (2*deltaz_10).

But this is obviously wrong, because in this situation 2*deltaz is always less than 1

It's not wrong. Not for the reason you've stated (or any reason I can see). 2*deltaz had better be less than 1 because it's a probability that goes from 0 to 1, as deltaz increases from 0 to half the width of the range (in your example). Perhaps you have misunderstood what I meant by deltaz? It's not the width of the zone in which a quasar lies centered about the Karlsson value. It's the distance of that specific data point z from the Karlsson value ... so it's one-half the zone width.

We can check that my formula is right for one data point by looking at the answer in your example if we let deltaz for a given quasar equal half the possible range (i.e., deltaz = 1/2 of 1). Then the probability is 2*deltaz = 1 of that quasar being within 1/2 of the midpoint. As it should be. The probability of finding a quasar that lies between 0 and 1 being somewhere between 0 and 1 is indeed 1. And if you have 10 QSO's that all have z within 0.5 (i.e., deltaz = 0.5) of a Karlsson value that is at the midpoint of the range 0 to 1, then the probability of finding that case is in fact 1 * 1 * 1 * etc. = 1.0 (assuming quasars are independent).

So if each 2*deltaz were 1/2, we'd get P=.001.

What's wrong with that? The probability of each quasar in that case being within deltaz = 0.25 (note that 2*deltaz = 1/2 means deltaz equals 0.25) of the midpoint in the range 0 to 1 is 0.5. And since each data point is independent, the probability of all of them being within 0.25 of the midpoint must be 0.5^10 = 0.000977 = 0.001. Again, I think you've misunderstood what I mean by deltaz.

But actually the product (which is a geometric mean) will be dominated by the occasional smaller values

And why shouldn't the product be dominated by the lowest probability data points if the data points are independent? Say we have 2 QSOs in your example instead of 10. One QSO has a deltaz of 0.5 so the probability of finding it is 1. One QSO has a deltaz of 0.001 so the probability of finding it, assuming quasars are randomly drawn from the range 0 to 1, is .002. The joint (multiplicative) probability is 1 * 0.002 = 0.002. That's the correct answer for finding both data points if they are independent events ... and it is indeed dominated by the lowest probability data point.

There is a correct way to do this kind of analysis. First of all, you cannot use a different value of deltaz for each point.

Sure you can. Say we are talking about your 1 Karlsson value case with 2 quasars. The quasars are independent of one another. Correct? According to the mainstream model there is not supposed to be any physical connection between two quasars in a given viewing field and the value of z for each is supposed to be independent of one another ... just come from the same overall distribution. The probability of quasar1 being within deltaz1 of a given Karlsson value is known (based on the above) and independent of the probability of quasar2 being within deltaz2 of that same Karlsson value, which is also known. Since they are independent, the joint probability of there being 2 quasars within those two specific distances of the same Karlsson value is the product of the probability of each quasar being within its specified deltaz. Right? So there's nothing wrong with the equation for 1 Karlsson value and 2 quasars. Adding more quasars doesn't change any of the above logic either. So the new formula is correct for 1 Karlsson value and any number of quasars.

If there is more than 1 Karlsson value in the total z range, then things get a little more complicated. But just a little. Let's again look at the case of 2 quasars. Say I find a deltaz to the nearest Karlsson value for the first quasar. Call it deltaz_nearest. The probability of that quasar being within deltaz_nearest of that Karlsson value is 2*deltaz_nearest. Now if the other Karlsson value can be anywhere in the 0 to 1 range, the probability of that quasar being within 2*deltaz_nearest of that Karlsson value is also at most 2*deltaz_nearest. So at the very least, I'm conservative in estimating the probability of it being within deltaz_nearest of both Karlsson values as twice the probability of it being within deltaz_nearest of the nearest one. And I can do the same for the other quasar, independent of the first, using its own different deltaz_nearest. And since they are independent quasars, I can multiply the two probabilities I've obtained to get the joint probability of both quasars being within their respective deltaz of both Karlsson values. Which is exactly what my new equation does. And again, adding more quasars doesn't change the validity of any of the above logic, nor does adding more Karlsson values. So I think you are wrong about your conclusion that this formula doesn't work for the purpose at hand.

I probably should modify the equation so there is no confusion and future misuse of it, however. It should read:

P <= ((2*n_k)/3)^r * (deltaz_1 * deltaz_2 * ... * deltaz_r)

where

n_k is the number of Karlsson values in the range 0-3,

r is the number of quasars,

and deltaz_i is the distance to the NEAREST Karlsson value of the i-th quasar,

assuming the distribution of quasar z is uniform in the range 0-3.

With regard to that last condition, z is not exactly uniform over the whole range, as I noted in my earlier posts. Do you have any suggestions, sol, on how to accurately incorporate that fact into the analysis ... assuming we can agree on everything else above? I previously tried a power law weighting where each of the deltaz_i would be raised to a weight that is the ratio of the frequency of that z in the real distribution to what the value would be were the distribution uniform with the same overall area under the frequency distribution as the real one. Do you think that a valid approach? Are the weights in that case meaningful? Note that the result, depending on the specific case, may either raise or lower the final probability.
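(One way to sanity-check the per-quasar factor (2*n_k*deltaz)/3 in the bound above is direct simulation against the seven Karlsson values; a minimal sketch:)

```python
import random

KARLSSON = [0.06, 0.3, 0.6, 0.96, 1.41, 1.96, 2.64]
Z_MAX = 3.0

def mc_prob(dz, trials=200_000):
    # Monte Carlo: P(a uniform z lands within dz of its nearest Karlsson value)
    hits = sum(
        min(abs(random.uniform(0.0, Z_MAX) - k) for k in KARLSSON) <= dz
        for _ in range(trials)
    )
    return hits / trials

for dz in (0.01, 0.05, 0.15):
    print(dz, mc_prob(dz), 2 * len(KARLSSON) * dz / Z_MAX)  # simulation vs (2*n_k*dz)/3
```

For small deltaz the simulation and the closed form agree; for larger deltaz, clipping at the ends of the range makes the closed form an overestimate, which is consistent with writing the formula as an upper bound.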
 
