Now my reputation is at stake, so I am going to clean up the mess that I didn’t create.
Stop blaming your failures on your critics. As soon as you stop pretending to be something you aren't, most of your troubles will go away.
One of my opponents wrote that categorical variables are used in clinical studies (I think everyone knows whom I am talking about).
Yes, we all know you're talking about me. The only reason for you to play coy and to avoid quoting where I supposedly said what you attribute to me would be if you plan to lie about what I said and make it hard for the readers to detect the lie. If you want to address what I say, quote where I said it.
Apparently, he doesn’t understand the difference between categorical data and categorical variables.
Of course I do, your frantic attempt below to split hairs notwithstanding. There is no material difference. You're just trying to manufacture one so that you can say one way was the way in which you were right earlier, and the other way was the way Jay was wrong.
I also notice that your sudden interest in multivariate analysis didn't start until I mentioned the word yesterday or the day before. Is there any point pretending that you're not just frantically Googling for words I mention, hoping to fool someone into thinking you actually know what you're talking about? The same thing happened with "t-test." You had no clue what it was until I mentioned it. Then you fall all over yourself trying to demonstrate that you knew about it all along.
Frankly, I couldn’t care less about his/her opinion.
Yet you seem to spend a lot of time and energy trying to make other people share your judgment. You can't or won't address the content of my posts, but you spend so much time telling everyone how dumb I must be. It takes no knowledge or skill to do that, just lots of insecurity.
“...Sometimes discrete variables are used in multivariate analysis in place of continuous ones if there are numerous categories, and the categories represent a quantitative attribute."
And sometimes the categories do not represent a quantitative attribute, such as when they represent blood type or gender or race or whether they have Type II diabetes or whether there was any history of cancer in the previous two generations. You cite an example of ordinal categorization, but you don't describe any of the other kinds. There is no hierarchy or ordering in blood types, so blood type is a pure categorical variable. So just another straw man. You cite one mode of categorization, and think that the other modes just go away because you didn't mention them. Do you realize that there are people here who already know about all this stuff, and when you blatantly cherry-pick the literature as you just did, they know for a fact that you're trying to deceive people?
I already explained why categorical variables are not used in clinical studies.
No, you didn't explain anything. You simply declared them not to apply, and then subsequently ignored the explanation of how they are actually used. Now, as usual, you've discovered your error and are trying to disguise your admission of it by framing it in more cobbled-up aspersions.
“Categorical data (normal or ordinal data)..."
See here how he mentions "normal" data, but your explanation above is limited to ordinal data? See how you accidentally showed that there's more to the definition of a categorical variable than you're letting your readers know about? See how we can tell that you either don't understand your sources or are deliberately misrepresenting them?
"...are counts of the member of observations in each category."
Yep, one for each patient. The variable is "Blood Type," just like a continuous variable, in contrast, might be called "Age." Each patient has one age, which is a continuous-valued variable. Each patient has one blood type, which is a categorically-valued variable. The values recorded for each variable across a sample are collectively called data. There is no magical distinction of the type you're trying to draw.
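For anyone following along, here is a minimal sketch of that relationship in Python. The patient list is made up purely for illustration, and `summarize` is a hypothetical helper name, not anything from the quoted source. Each patient contributes exactly one value of the variable "blood type"; tallying those values over the sample gives the counts and percentages that the quotation calls categorical data.

```python
from collections import Counter

# Hypothetical sample (made-up data): one blood-type value per patient.
blood_types = ["O", "A", "A", "B", "O", "AB", "O", "A", "O", "B"]


def summarize(values):
    """Return {category: (count, percent)} for one categorical variable."""
    counts = Counter(values)
    total = len(values)
    return {cat: (n, 100.0 * n / total) for cat, n in counts.items()}


summary = summarize(blood_types)
for cat in sorted(summary):
    n, pct = summary[cat]
    print(f"{cat}: {n} patients ({pct:.0f}%)")
```

The variable names the attribute; the data are just the recorded values of that attribute, one per patient, summarized per category.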
"Oh, Jay was talking about categorical variables, but I was talking about categorical data. There's a big difference." No you weren't, and no there isn't. You specifically railed against the idea of categorical data, because you couldn't figure out how to make the few analysis methods you knew about work with them. Because they didn't fit the knowledge you had, you dismissed the whole concept as outside the realm of statistics.
You could have handled this a number of ways. You could have said, "Yes, Jay and Alec were right and I was wrong. I'll be more careful in the future." Or, if saving face is important, you could have softened the admission. "I misunderstood what Jay and Alec were saying, so I looked it up and now I get what they're trying to say." That too would have been acceptable. What's comically arrogant is for you to double down on the error and try to come up with a lame story for why the facts are the way they are, but you're still somehow right and your opponents here are still somehow wrong. That's insulting. What this says to anyone who would contemplate interacting with you -- including your "professional mathematicians" -- is that you'll do absolutely whatever it takes, and stoop to any deception, in order not to admit you're wrong. That's not healthy.
"Such data are often described with percentages or other ratios. For example, if a sample is divided into four nominal categories on the basis of blood type, the number of patients in each category might be presented as four percentages with total 100%”.
Yes, what a shocker. The data that populate these variables, whether categorical or continuous, form a distribution just like all the other data in statistics. If you had paid attention to my discussion of Jeffers, you would have been able to discuss the effects of discretization on the distribution of data for a particular variable. Back then you said it was "irrelevant." Now it seems to be relevant again, because you suddenly learned what it was.
Except in a pure category like blood type there's no ordinality and therefore no curve. The distribution occurs as a different kind of mathematical construct -- a set. Since the theories that apply to ordinal univariates have no footing here, there is an entirely different set of statistics that we use, such as the chi-square test for independence. I've mentioned that example several times. This appears to be a branch of statistics that you are completely unaware of. Yet it's quite commonly used in the analysis of human-subjects data. The volitional variable you keep stumbling over with respect to Operator 010 is a categorical variable. The data collected according to that variable for each operator is, collectively, categorical data. It's meaningless to try to separate those concepts.
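Since I keep mentioning the chi-square test for independence, here is a rough sketch of the textbook Pearson calculation in plain Python. The counts are invented for illustration only, the function name `chi_square_independence` is my own, and the rows and columns (outcome by blood type) are an assumed layout, not data from any study discussed here.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom.

    `table` is a contingency table: a list of rows of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof


# Hypothetical counts: rows = outcome (improved, not improved),
# columns = blood type (O, A, B, AB).
observed = [
    [18, 12, 6, 4],
    [22, 18, 14, 6],
]

stat, dof = chi_square_independence(observed)
# 7.815 is the standard tabulated critical value for 3 degrees of
# freedom at the 0.05 significance level.
print(f"chi-square = {stat:.3f}, dof = {dof}, exceeds 7.815: {stat > 7.815}")
```

Note that nothing in the calculation depends on the order of the columns. That is exactly why it suits a variable like blood type, where no ordering exists.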
Categorical data is used in clinical trials, albeit not on regular basis.
Nice try. You finally figured out what "categorical" means, and you finally looked at some real medical studies and found out that categorical variables (and the data that populate them) are indeed widely used in medical research. And so now, in order to save face, you're trying some Hail-Mary duplicity to make it seem like your error occurred only because Jay still somehow screwed up.
Pathetic.
One more thing – blood type is not a categorical variable because it has only one value for each patient...
Bwahahaha! You're so tied up in knots you seem to think a patient could have multiple values for a single "categorical variable." That's all the proof the world needs that you have absolutely no clue what you're talking about. You cited part of a definition and then displayed that you didn't understand a word of it. This is the most cargo-culty thing I think I've seen today.
Blood type is a categorical data, as the above quotation shows.
Yes it is, but not in contrast to the term "categorical variable," a difference that exists only in your head. The concept of blood type is a categorical variable. The values for that variable naturally occur as categories. When designing the experiment, the experimenters might well say something like, "We want to be sure to correlate the outcomes with blood type." That's a reference to the concept of the variable. It names a quantity that varies from subject to subject in a categorical fashion, just like "age" names a quantity that varies from subject to subject in a continuous fashion. It is meaningless to try to separate the concept of a variable from the concept of the values that variable can have, and the values it does have for some sample.