
"A measurable function of a random variable is a random variable"?

mijopaalmc (joined Mar 10, 2007):
The title of the thread is a frequently stated (though often unproven) theorem in probability theory and statistics.

In another thread, someone pointed out to me that functions such as f(X)=0X+c and f(X)=X^0+c, where X is a random variable and c is a real number, are not random variables because they yield the same value for every value of the random variable X.

Now, this seems sensible; however, it does seem strange that such an important theorem in probability theory and statistics would have such an often-overlooked, obvious counterexample.

Could someone help me reconcile the proof with the apparent counterexample?

Note: I would be happy to provide a more thorough explanation of my understanding of the basic concepts of probability theory. I did not want to make this post needlessly long though.
 
Non-trivially, functions of a random variable certainly gain a degree of predictability that the raw random number does not have. For instance, if X is a random real number, then:

f(X) = X^2 is a random, non-negative real number;
f(X) = X^(1/2) will alter the distribution of the random numbers in a predictable manner (i.e. the result will look 'less random').

So I'm not so sure that the results of these functions have the same degree of randomness as the original variable.
 
Conventionally, if a function is declared as a function of another variable, e.g. f(X) in your example, it means the result of f(X) is dependent on the value of X.

Look at your examples. Neither of them are dependent on the value of X; they are both independent of X. Therefore, these functions are not functions of X as stated.
 
In another thread, someone pointed out to me that functions such as f(X)=0X+c and f(X)=X^0+c, where X is a random variable and c is a real number, are not random variables because they yield the same value for every value of the random variable X.

When you talk about "measurable functions" and "random variables" you're into set theory, which is way over my head. But just from a naive, natural language standpoint, it seems to me that when people say "a function of a random variable" that's probably just shorthand for "a function whose value is dependent on the value of a random variable."
 
A random variable is just a measurable function on a probability space. Even though a function roughly means something whose value depends on its input, functions taking constant values are allowed as functions. So you can have a random variable that is constant.

This sort of inclusion of trivial cases is almost universal in mathematics: without it you'd need exceptions all over the place. For example, if X and Y are two random variables on the same space, then X-Y is a random variable. But of course it might be constant even if X and Y are not.

ETA: To CoolSceptic - for just the same reason the function from the real numbers to the real numbers which assigns the value 27 to any input is regarded as a function by all mathematicians.
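A quick simulation illustrates the point about constant functions (a minimal sketch, not from the thread; the constant 27 echoes the example above):

```python
import random

def f(x):
    # A constant function: it ignores its input entirely, yet is still
    # a valid function, so f(X) is still a (degenerate) random variable.
    return 0 * x + 27

# Draw samples of a random variable X and push them through f.
samples = [random.gauss(0, 1) for _ in range(1000)]
values = [f(x) for x in samples]

# Every output is 27: f(X) takes the value 27 with probability 1.
assert all(v == 27 for v in values)
```

The point is that nothing in the definition of "function" requires the output to vary with the input.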
 
Conventionally, if a function is declared as a function of another variable, e.g. f(X) in your example, it means the result of f(X) is dependent on the value of X.

Look at your examples. Neither of them are dependent on the value of X; they are both independent of X. Therefore, these functions are not functions of X as stated.

Except, as I understand it, there is such a thing as a constant function, which takes the same value regardless of the independent variable. In this case, f maps every point in its domain to the same point in its codomain, and it is still a function, because each point in its domain maps to no more than one point in its codomain.
 
In another thread, someone pointed out to me that functions such as f(X)=0X+c and f(X)=X^0+c, where X is a random variable and c is a real number, are not random variables because they yield the same value for every value of the random variable X.

Now, this seems sensible; however, it does seem strange that such a proof of such an important theorem in probability theory and statistics would have such an often overlooked, obvious counterexample.

Could someone help me reconcile the proof with the apparent counterexample?

The proof is using a definition of "random variable" according to which a constant is a random variable: it simply happens to take on one particular value with probability 1, and every other value with probability 0.

(And, on previewing my post, I see Meridian just said the same thing.)
 
Your posting states that the function taking constant values is a trivial case and I agree.
But it would be nice if the proof in the OP was addressed.
 
Ah, I knew there was a reason I always avoided set theory like the plague.

RealityCheck, I think the point is not just that a topological mapping from some set of numbers to a constant is a valid (albeit trivial) function, but that a constant is a valid (albeit trivial) random variable.

It's all in the definitions, of course :)
 
Your posting states that the function taking constant values is a trivial case and I agree.
But it would be nice if the proof in the OP was addressed.

I'm not quite sure what you mean: the question in the OP has been addressed. If you'd like a proof of the theorem - it proves itself once you know the definitions. Roughly speaking, a function [LATEX]$f:X\to Y$[/LATEX] is measurable if for every measurable subset [LATEX]$B$[/LATEX] of [LATEX]$Y$[/LATEX] its preimage under [LATEX]$f$[/LATEX], i.e., [LATEX]$f^{-1}(B)=\{x\in X:f(x)\in B\}$[/LATEX] is a measurable subset of [LATEX]$X$[/LATEX]. What measurable means may be different for the two spaces: more formally you have to specify in advance a collection of measurable sets satisfying certain rules. If the spaces are the real numbers, the default collection is the Borel measurable sets (or the Lebesgue measurable sets).

Anyway, a constant function is always measurable, since the preimage of any set under a constant function is either the empty set or the whole space [LATEX]$X$[/LATEX], both of which are always measurable. For the theorem: a random variable [LATEX]$W$[/LATEX] simply means a measurable function from some probability space [LATEX]$\Omega$[/LATEX] to some other space [LATEX]$X$[/LATEX], usually the real numbers. Then as a function, the random variable [LATEX]$f(W)$[/LATEX] is just the composition [LATEX]$f\circ W$[/LATEX] of [LATEX]$W$[/LATEX] and [LATEX]$f$[/LATEX]. Since the composition of two measurable functions is measurable (easy from the definition), [LATEX]$f$[/LATEX] measurable implies [LATEX]$f(W)$[/LATEX] measurable.
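The preimage definition is easy to check by brute force on a finite probability space, where every subset is measurable. A minimal sketch (the names `W`, `f`, and `preimage` are just illustrative, not from the thread):

```python
# Two coin tosses: a tiny probability space where every subset is measurable.
omega = {"HH", "HT", "TH", "TT"}

def W(outcome):
    return outcome.count("H")   # number of heads: a random variable on omega

def f(n):
    return n ** 2               # a measurable function of that value

def preimage(g, domain, B):
    """All points of `domain` that g maps into the set B."""
    return {x for x in domain if g(x) in B}

# The composed variable f(W) is the composition f ∘ W; its preimages are
# subsets of omega, hence measurable here.
fW = lambda outcome: f(W(outcome))
print(preimage(fW, omega, {1}))   # the outcomes with exactly one head
```

Since every preimage lands back in the (here trivially measurable) subsets of omega, f(W) is itself a random variable, exactly as the proof above says.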
 
I'm not quite sure what you mean: the question in the OP has been addressed. If you'd like a proof of the theorem - it proves itself once you know the definitions. Roughly speaking, a function [LATEX]$f:X\to Y$[/LATEX] is measurable if for every measurable subset [LATEX]$B$[/LATEX] of [LATEX]$Y$[/LATEX] its preimage under [LATEX]$f$[/LATEX], i.e., [LATEX]$f^{-1}(B)=\{x\in X:f(x)\in B\}$[/LATEX] is a measurable subset of [LATEX]$X$[/LATEX]. What measurable means may be different for the two spaces: more formally you have to specify in advance a collection of measurable sets satisfying certain rules. If the spaces are the real numbers, the default collection is the Borel measurable sets (or the Lebesgue measurable sets).

Anyway, a constant function is always measurable, since the preimage of any set under a constant function is either the empty set or the whole space [LATEX]$X$[/LATEX], both of which are always measurable. For the theorem: a random variable [LATEX]$W$[/LATEX] simply means a measurable function from some probability space [LATEX]$\Omega$[/LATEX] to some other space [LATEX]$X$[/LATEX], usually the real numbers. Then as a function, the random variable [LATEX]$f(W)$[/LATEX] is just the composition [LATEX]$f\circ W$[/LATEX] of [LATEX]$W$[/LATEX] and [LATEX]$f$[/LATEX]. Since the composition of two measurable functions is measurable (easy from the definition), [LATEX]$f$[/LATEX] measurable implies [LATEX]$f(W)$[/LATEX] measurable.

<-------------- Your explanation



[my poor, aching head]
 
For a function not to be measurable, it needs to be "very complicated". The reason why a constant might intuitively seem not to qualify as a random variable is that it's "too simple". But too simple is ok; too complicated is what causes trouble.

Regarding the definition of "y is a function of x" as "y depends on x", it's probably better to think of it as meaning "y doesn't depend on anything other than x". That way, it's clearer that a constant y satisfies the definition.
 
I'm not quite sure what you mean: the question in the OP has been addressed. If you'd like a proof of the theorem - it proves itself once you know the definitions. Roughly speaking, a function [LATEX]$f:X\to Y$[/LATEX] is measurable if for every measurable subset [LATEX]$B$[/LATEX] of [LATEX]$Y$[/LATEX] its preimage under [LATEX]$f$[/LATEX], i.e., [LATEX]$f^{-1}(B)=\{x\in X:f(x)\in B\}$[/LATEX] is a measurable subset of [LATEX]$X$[/LATEX]. What measurable means may be different for the two spaces: more formally you have to specify in advance a collection of measurable sets satisfying certain rules. If the spaces are the real numbers, the default collection is the Borel measurable sets (or the Lebesgue measurable sets).

Anyway, a constant function is always measurable, since the preimage of any set under a constant function is either the empty set or the whole space [LATEX]$X$[/LATEX], both of which are always measurable. For the theorem: a random variable [LATEX]$W$[/LATEX] simply means a measurable function from some probability space [LATEX]$\Omega$[/LATEX] to some other space [LATEX]$X$[/LATEX], usually the real numbers. Then as a function, the random variable [LATEX]$f(W)$[/LATEX] is just the composition [LATEX]$f\circ W$[/LATEX] of [LATEX]$W$[/LATEX] and [LATEX]$f$[/LATEX]. Since the composition of two measurable functions is measurable (easy from the definition), [LATEX]$f$[/LATEX] measurable implies [LATEX]$f(W)$[/LATEX] measurable.

That is what I was looking for - a proof that a constant function is always measurable. Thus a constant function is Borel measurable, the theorem in the OP covers it, and it is not a counterexample.

That should satisfy mijopaalmc.
 
[my poor, aching head]

You're not missing much! I'm not sure why I posted it, since the whole thing is almost certainly only interesting if you are a mathematician. The whole business with measurable sets and measurable functions arises because when you try to make some common-sense stuff work mathematically, you run into contradictions, so it turns out you need some restrictions on the definitions. It's very much like needing the axioms of set theory to avoid the 'set of all sets not belonging to themselves' paradox.

In fact, any function you can possibly conceive of is measurable. In practice, a random variable is just a function on a probability space, assigning a number, e.g., the number of heads, to each possible outcome. So a function of a random variable is still a random variable. A constant function is a function, and a constant random variable (e.g., number of heads + number of tails, if you toss 3 coins) is a random variable, for the same reason that 0 is an integer or a square is a rectangle.
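The coin-toss example can be checked exhaustively in a few lines (a minimal sketch, just enumerating the sample space from the post):

```python
import itertools

# For 3 coin tosses, (number of heads) + (number of tails) is the same
# for every outcome, yet it is still a function on the sample space -
# a constant random variable.
outcomes = list(itertools.product("HT", repeat=3))
values = {o.count("H") + o.count("T") for o in outcomes}
print(values)   # {3}: one value, taken with probability 1
```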
 
Your posting states that the function taking constant values is a trivial case and I agree.
But it would be nice if the proof in the OP was addressed.

When I read this thread, I think what's needed is to take a step back and explain the ideas behind all this. It's 20 years since I've looked at either probability theory or measure theory, so please correct me if I goof up some of it.

In probability theory, the basic notion is a random variable X which "records" your events. If your event is something like the throw of a die, you can speak of the chance that you throw a 5: P[X = 5], or any other value. However, when your event can have an arbitrary real value - say, you measure someone's height - then it's senseless to speak of P[X = 1.83], as that chance is 0. Instead you have a density function f(x) to capture the relative chance of getting that value - relative to all other values - and to actually calculate the chance of some set of values being hit, you take an integral. The usual distribution function is its antiderivative, so

[latex]$$F(x) = \int_{-\infty}^{x} f(t) dt$$[/latex]

which gives the chance P[X < x] that your event records a smaller value than x. So the prerequisite for being able to work with a random variable is that its density function be integrable. And given that it is, you can then ask for which sets S you can compute the chance that your random variable "hits" the set:

[latex]$$P[X \in S] = \int_{t \in S} f(t) dt$$[/latex]

That's a way of looking at integrals you're probably not used to from high-school mathematics, and that's where measure theory comes in. Measure theory is about setting up a system of subsets -- called "measurable sets" -- of a given set for which it makes sense to define integrals. So the above formula makes sense if S is a measurable set. Think in the first place of sets like the interval [1.5, 3.14] -- that's where measure theory on the real numbers starts -- but also of (countable) unions and intersections of those.

In particular, measure theory assigns to every measurable set a "size" of the set (just integrate the constant function 1 over the set).
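The two integral formulas above can be sketched numerically (a crude illustration only, with a midpoint Riemann sum standing in for the Lebesgue integral; the standard normal density and the function names are my own choices, not from the thread):

```python
import math

def f(t):
    # Standard normal density, as an example of a density function f.
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)

def integrate(g, a, b, n=100_000):
    """Midpoint Riemann sum of g over the interval [a, b]."""
    dt = (b - a) / n
    return sum(g(a + (i + 0.5) * dt) for i in range(n)) * dt

# P[a <= X <= b] = integral of the density over S = [a, b]:
print(round(integrate(f, -1, 1), 3))               # ≈ 0.683, the familiar 68% rule

# The "size" (measure) of [1.5, 3.14]: integrate the constant function 1.
print(round(integrate(lambda t: 1, 1.5, 3.14), 3)) # 1.64, the interval's length

# A single point has measure 0, which is why P[X = 1.83] = 0:
print(integrate(f, 1.83, 1.83))                    # 0.0
```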

When you look at the definition of a random variable, you see it is in effect a function from a probability space S1 to a measurable space S2. A measurable space is a set endowed with a system of measurable subsets; a probability space is a measurable space in which the whole set has size 1 - which is logical, as the probability of getting any result at all is by definition 1.

Think of the space S1 as the possible events ("measuring someone's height"), and S2 as their translation into some number; if you want to nitpick, they're two different spaces. The translation has to be a measurable function, which means that the function behaves nicely with respect to the measure systems on S1 and S2, so that P[X \in T], where T is a measurable subset of S2, can be calculated. The word "random" is maybe misleading; you could assign to a random variable the value 3 no matter the outcome of the event.

The theorem says that "a measurable function g of a random variable is again a random variable". If you look at the definitions, that just means that if you compose the transformation function of the random variable with another (measurable) function, you again get a measurable function, mapping the original event space S1 now to a new space S3, and again you can calculate P[g(X) \in T] for any measurable subset T of S3.

So, in short, random variables are not really about randomness but more about being able to ask "what's the chance the event has a value in some reasonable set?", and measurable functions are about behaving nicely with respect to those reasonable sets.
 
In fact, any function you can possibly conceive of is measurable.
Such a statement is just an invitation to mess with people's heads :rolleyes:

What about:

f(x) = 1 if x in T
f(x) = 0 if x not in T

where T is some non-measurable set? I forgot, is Q measurable?
 
Such a statement is just an invitation to mess with people's heads :rolleyes:

What about:

f(x) = 1 if x in T
f(x) = 0 if x not in T

where T is some non-measurable set? I forgot, is Q measurable?

Yes - any set you can conceive of is measurable :)

In fact, I'm on pretty safe ground. It's impossible to explicitly construct a non (Lebesgue-)measurable set. To prove one exists you need to use the axiom of choice.
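For the curious, here is a sketch of the standard Vitali construction behind that claim (textbook material, not spelled out in this thread):

```latex
% Vitali's non-measurable set (sketch):
\text{Declare } x \sim y \iff x - y \in \mathbb{Q}, \text{ and use the axiom of choice}\\
\text{to pick a set } V \subset [0,1] \text{ containing one point from each equivalence class.}\\
\text{The rational translates of } V \text{ are pairwise disjoint, and}\\
[0,1] \;\subseteq\; \bigcup_{q \in \mathbb{Q} \cap [-1,1]} (V + q) \;\subseteq\; [-1,2].\\
\text{If } V \text{ were measurable with measure } m, \text{ countable additivity and translation}\\
\text{invariance would force the middle union to have measure } 0 \text{ (if } m = 0\text{)}\\
\text{or } \infty \text{ (if } m > 0\text{), neither of which fits between } 1 \text{ and } 3.
```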
 
Yes - any set you can conceive of is measurable :)

In fact, I'm on pretty safe ground. It's impossible to explicitly construct a non (Lebesgue-)measurable set. To prove one exists you need to use the axiom of choice.
Thanks. Of course, I had to whip out my undergrad lecture notes to verify your answer. You can't trust just some guy on the internet, can you? :)

The style of your answer suggests you don't adhere to the axiom of choice? You're missing out then on a lot of mathematics ;).

I'm pro-choice :D
 
