So the null hypothesis in your supposed version is actually "All UFO sightings are the result of mundane explanations." Expecting to see no difference in distribution of characteristics between identified and unidentified cases is a test of that null hypothesis.
Perhaps a note of clarification is in order then:
So if the statement being tested is (correcting for the fact that “explanations” cannot cause anything):
H0: "All UFO sightings are the result of a misidentification of mundane objects"
Then the alternative hypothesis (to be accepted if the null is rejected) would be:
Ha: "Not all UFO sightings are the result of a misidentification of mundane objects"
However, we must come up with an experimental paradigm that allows us to test the null hypothesis. Simply saying "produce an ET" is not going to give us a workable test that we can conduct here and now: producing one would falsify the null hypothesis, but ostensibly no one has been able to capture an ET and present it (although if you happen to know of someone who can produce an ET… LOL).
So, we need an alternative test of the null hypothesis and in that respect I have proposed:
If the H0 is true, then we would expect no difference on defined characteristics between known category reports and unknown category reports.
This is because if all reports arise from mundane objects, then we would expect the characteristics of those objects to be distributed in the same way across known and unknown reports.
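To make that concrete, here is a rough sketch of how such a comparison could be run as a chi-square test of homogeneity. The function name, the characteristic categories and all the counts are made up purely for illustration; they are not taken from any real set of reports. It also assumes every row and column has at least one report, otherwise an expected count would be zero.

```python
def chi_square_homogeneity(table):
    """Chi-square statistic and degrees of freedom for a table of counts.

    Each row is one report category (e.g. known, unknown) and each column
    is one level of a characteristic. Assumes no row or column total is zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if both categories share one distribution
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical counts for one characteristic with three levels
known_counts = [30, 50, 20]
unknown_counts = [10, 20, 20]

stat, dof = chi_square_homogeneity([known_counts, unknown_counts])
print(f"chi-square = {stat:.2f} on {dof} degrees of freedom")
```

A large statistic relative to the chi-square distribution with that many degrees of freedom would count against "no difference"; a small one would be consistent with H0. The p-value itself would come from a chi-square table or a statistics library.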
Now of course there may be factors that in turn may falsify that assumption, but if there are, we must then control for those factors. I can think of one factor: that of “reliability” of reports.
It may be that the less reliable a report, the more likely it will be to result in an unknown categorisation (thus skewing the distribution).
Of course we must then test the reports for reliability and factor that into our calculations. That is, before testing our null hypothesis, we must test another hypothesis – that is:
Does reliability affect report categorisation in such a way that the less reliable the report, the more likely an unknown categorisation will result? Once we have the answer to that question, we can account for it in the test of our original H0.
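As a first look at the reliability question, one could simply tabulate the share of unknowns within each reliability tier. The sketch below assumes each report has already been given a reliability tier and a category; the tier labels, the function name and the sample reports are all hypothetical.

```python
from collections import defaultdict


def unknown_rate_by_reliability(reports):
    """Return the fraction of 'unknown' reports within each reliability tier.

    `reports` is a list of (tier, category) pairs, where category is
    either "known" or "unknown".
    """
    counts = defaultdict(lambda: [0, 0])  # tier -> [unknown count, total count]
    for tier, category in reports:
        counts[tier][1] += 1
        if category == "unknown":
            counts[tier][0] += 1
    return {tier: unknown / total for tier, (unknown, total) in counts.items()}


# Hypothetical reports, for illustration only
sample_reports = [
    ("low", "unknown"), ("low", "known"), ("low", "unknown"),
    ("high", "known"), ("high", "known"), ("high", "unknown"),
]

for tier, rate in unknown_rate_by_reliability(sample_reports).items():
    print(f"{tier} reliability: {rate:.0%} unknown")
```

If the unknown rate does turn out to differ between tiers, one way to account for it would be to run the known versus unknown comparison separately within each tier rather than on the pooled reports.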
Sound reasonable?
(That’s one of wollery’s posts attended to…)