Bernoulli and Uniform Random Variables
This post reports some responses from GPT 5.2 to questions about atomless probability spaces.
1 Background
Let \((\Omega,\mathcal F,\mathsf P)\) be a probability space. A random variable \(U:\Omega\to[0,1]\) is uniform if \[ \mathsf P(U\le u)=u,\qquad u\in[0,1]. \] Equivalently, \(U\) has distribution \(\mathrm{Unif}(0,1)\).
Uniform random variables play a distinguished role because any distribution on \(\mathbb R\) can be generated from a uniform one via quantile transforms. If \(F\) is a distribution function and \(U\sim\mathrm{Unif}(0,1)\), then \[ X = F^{-1}(U) \] has distribution \(F\) (with the usual generalized inverse). Thus the existence of a single uniform random variable implies the existence of random variables with essentially arbitrary laws.
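As a concrete illustration (a minimal numerical sketch, not part of the original discussion), the following Python snippet applies the quantile transform with \(F^{-1}(u)=-\log(1-u)/\lambda\) to uniform draws and checks that the output matches the exponential law:

```python
import numpy as np

rng = np.random.default_rng(0)

lam = 2.0                    # rate of the target Exp(lam) distribution
U = rng.random(100_000)      # i.i.d. Unif(0,1) draws

# Quantile transform: F^{-1}(u) = -log(1 - u) / lam for F(x) = 1 - exp(-lam * x)
X = -np.log1p(-U) / lam

print("empirical mean    :", round(float(X.mean()), 4))              # close to 1/lam = 0.5
print("empirical P(X<=1) :", round(float((X <= 1.0).mean()), 4),
      " exact:", round(1 - np.exp(-lam), 4))
```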
An event \(A\in\mathcal F\) with \(\mathsf P(A)>0\) is an atom if every measurable \(B\subset A\) satisfies either \(\mathsf P(B)=0\) or \(\mathsf P(B)=\mathsf P(A)\). Intuitively, an atom is a chunk of probability mass that cannot be subdivided.
The probability space is atomless if it has no atoms. Equivalently, for every \(A\) with \(\mathsf P(A)>0\) and every \(p\in(0,\mathsf P(A))\), there exists \(B\subset A\) with \(\mathsf P(B)=p\).
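To make the subdivision property concrete, here is a hedged sketch assuming the standard atomless space \(([0,1],\mathcal B,\lambda)\) and an event \(A\) given as a finite union of intervals (the helper names are illustrative): bisection on the continuous function \(x\mapsto\lambda(A\cap[0,x])\) produces a sub-event of any prescribed measure \(p\in(0,\lambda(A))\).

```python
def measure_up_to(intervals, x):
    """Lebesgue measure of A ∩ [0, x], where A is a union of disjoint intervals [(a, b), ...]."""
    return sum(max(0.0, min(b, x) - a) for a, b in intervals)

def split_event(intervals, p, tol=1e-12):
    """Find x* so that B = A ∩ [0, x*] has measure p (requires 0 < p < measure of A)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if measure_up_to(intervals, mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

A = [(0.1, 0.3), (0.5, 0.9)]             # lambda(A) = 0.6
x_star = split_event(A, p=0.25)
print(x_star, measure_up_to(A, x_star))  # measure of B is (approximately) 0.25
```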
Atomlessness is the key structural property underlying the existence of continuous random variables. The following statements are equivalent. Any one may be taken as a standing richness assumption.
Atomlessness. \((\Omega,\mathcal F,\mathsf P)\) is atomless.
Existence of a uniform random variable. There exists \(U:\Omega\to[0,1]\) with \(U\sim\mathrm{Unif}(0,1)\).
Existence of i.i.d. Bernoulli sequences. There exists a sequence \((B_n)_{n\ge1}\) of independent, identically distributed Bernoulli\((1/2)\) random variables.
Existence of i.i.d. sequences with arbitrary law. For every probability measure \(\mu\) on \(\mathbb R\), there exists a sequence \((X_n)_{n\ge1}\) of i.i.d. random variables with common distribution \(\mu\).
Existence of a continuous random variable. There exists a random variable \(X\) whose distribution has no atoms (equivalently, \(\mathsf P(X=x)=0\) for all \(x\)).
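One direction of the equivalence between the uniform and the Bernoulli sequence is easy to check numerically: the binary digits \(B_n=\lfloor 2^nU\rfloor \bmod 2\) of a uniform \(U\) are i.i.d. Bernoulli\((1/2)\). A minimal sketch (the code is illustrative, not from the original answers):

```python
import numpy as np

rng = np.random.default_rng(1)
U = rng.random(200_000)                       # U ~ Unif(0,1)

# First 8 binary digits of U: B_n = floor(2^n * U) mod 2
bits = np.stack([(np.floor(2**n * U) % 2).astype(int) for n in range(1, 9)])

print("digit means    :", bits.mean(axis=1).round(3))                       # each ≈ 0.5
print("corr(B_1, B_2) :", round(float(np.corrcoef(bits[0], bits[1])[0, 1]), 4))  # ≈ 0
```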
2 Why \(p=1/2\) and the Uniform are Special
An i.i.d. Bernoulli\((1/2)\) sequence lets you build a uniform via the dyadic expansion. You can replace it with a Bernoulli\((p)\) sequence, but the same construction no longer produces a uniform unless \(p=1/2\).
2.1 Why Bernoulli\((1/2)\) gives a uniform
Given i.i.d. Bernoulli\((1/2)\) variables \((B_n)\), define \[ U = \sum_{n=1}^\infty 2^{-n} B_n. \] This is exactly the binary expansion of a number in \([0,1]\). The product measure on \(\{0,1\}^\mathbb N\) with equal weights pushes forward to Lebesgue measure on \([0,1]\), modulo the usual dyadic non-uniqueness, which is null. This works because:
- the digits are i.i.d.,
- each digit has entropy \(\log 2\),
- the scaling \(2^{-n}\) matches the branching factor.
This is not an accident; the identification is rigid, and changing any one of these ingredients breaks it.
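A short simulation of the dyadic sum, truncated at finitely many digits (a sketch with my own truncation choices): generate i.i.d. Bernoulli\((1/2)\) bits, form \(U=\sum_n 2^{-n}B_n\), and compare the empirical CDF with the identity.

```python
import numpy as np

rng = np.random.default_rng(2)
n_samples, n_bits = 100_000, 32

# B[k, n] = n-th Bernoulli(1/2) digit of the k-th sample
B = rng.integers(0, 2, size=(n_samples, n_bits))
weights = 0.5 ** np.arange(1, n_bits + 1)
U = B.astype(float) @ weights                 # truncated dyadic sum, error < 2^{-32}

# Approximate Kolmogorov–Smirnov distance to the Unif(0,1) CDF
U_sorted = np.sort(U)
grid = np.arange(1, n_samples + 1) / n_samples
print("KS distance to Unif(0,1):", round(float(np.max(np.abs(U_sorted - grid))), 4))
```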
2.2 What happens with Bernoulli\((p)\), \(p\neq 1/2\)
If you keep the same dyadic map \[ X_p := \sum_{n=1}^\infty 2^{-n} B_n,\qquad B_n\sim\text{Bernoulli}(p), \] the resulting distribution is a Bernoulli convolution with parameter \(1/2\) and bias \(p\). Facts:
- For \(p\neq 1/2\), the measure is singular with respect to Lebesgue: by the strong law, Lebesgue-almost every point has binary digit frequency \(1/2\), while \(X_p\)-almost every point has digit frequency \(p\).
- Its support is all of \([0,1]\), but it concentrates on a set of Hausdorff dimension strictly less than \(1\).
- The CDF is continuous and strictly increasing, but singular: its derivative is zero Lebesgue-almost everywhere.
- You lose absolute continuity, quantiles behave pathologically, and coupling arguments break.
So while \(X_p\) is continuous (no atoms), it is not uniform and not interchangeable with \(U\) for measure-theoretic work. This is why in [1] and elsewhere the Bernoulli\((1/2)\) case is singled out: it is the only bias compatible with Lebesgue under base-2 scaling.
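Rerunning the same dyadic sum with biased bits makes the contrast visible (again a sketch; the parameters are mine): the quantiles of \(X_p\) drift far from the uniform ones, and the digit frequencies concentrate near \(p\) rather than \(1/2\), which is exactly the strong-law argument for singularity.

```python
import numpy as np

rng = np.random.default_rng(3)
p, n_samples, n_bits = 0.3, 100_000, 32

B = (rng.random((n_samples, n_bits)) < p).astype(float)   # i.i.d. Bernoulli(p) digits
Xp = B @ (0.5 ** np.arange(1, n_bits + 1))                # dyadic sum with biased digits

for u in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(f"quantile at {u:4}: {np.quantile(Xp, u):.4f}   (uniform would give {u})")

# X_p-typical points have digit frequency ≈ p, Lebesgue-typical points have ≈ 1/2:
print("mean digit frequency:", round(float(B.mean()), 4))
```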
2.3 Could you “use” \(X_p\) instead of a uniform?
That depends on what you mean by “use.”
2.3.1 What you still get
- The probability space remains atomless.
- You can generate continuous random variables.
- You can support i.i.d. sequences.
- Law invariance results still hold abstractly.
2.3.2 What you lose
- Quantile transforms now involve the generalized inverse of a singular CDF, so they no longer preserve any smoothness.
- Rearrangement arguments become ugly.
- Any construction that implicitly assumes Lebesgue measure on \([0,1]\) breaks.
- Spectral risk measures become painful to represent explicitly.
So theoretically yes, practically no—unless you enjoy working with singular measures. This is why essentially all of probability theory quietly normalizes to Lebesgue via a uniform.
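To illustrate the “theoretically yes” half: since the CDF \(F_p\) of \(X_p\) is continuous and strictly increasing, \(F_p(X_p)\sim\mathrm{Unif}(0,1)\), so \(X_p\) can always be repaired into a uniform, at the price of evaluating a singular CDF digit by digit via \(F_p(x)=\sum_{n:\,x_n=1}(1-p)\prod_{k<n}p^{x_k}(1-p)^{1-x_k}\). A hedged sketch (the function names are mine):

```python
import numpy as np

rng = np.random.default_rng(4)
p, n_samples, n_bits = 0.3, 50_000, 40

def F_p(x, p, n_bits=40):
    """Singular CDF of X_p, evaluated from the binary digits of x."""
    total, prefix_prob = 0.0, 1.0
    for _ in range(n_bits):
        x *= 2
        digit = int(x)                       # next binary digit of x
        x -= digit
        if digit == 1:
            total += prefix_prob * (1 - p)   # branch where X_p's digit is 0 here
            prefix_prob *= p                 # keep matching: X_p's digit must be 1
        else:
            prefix_prob *= (1 - p)           # keep matching: X_p's digit must be 0
    return total

B = (rng.random((n_samples, n_bits)) < p).astype(float)
Xp = B @ (0.5 ** np.arange(1, n_bits + 1))

U = np.array([F_p(x, p) for x in Xp])        # F_p(X_p) should be uniform
U_sorted = np.sort(U)
grid = np.arange(1, n_samples + 1) / n_samples
print("KS distance of F_p(X_p) to Unif(0,1):", round(float(np.max(np.abs(U_sorted - grid))), 4))
```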
2.4 Is there a \(p\)-adic or symbolic-dynamics analog?
Yes—but it lives in a different universe.
2.4.1 Symbolic dynamics view
The space \(\{0,1\}^\mathbb N\) with Bernoulli\((p)\) measure is a perfectly natural probability space. The dyadic map \[ \pi:\{0,1\}^\mathbb N \to [0,1] \] is a coding map. For \(p=1/2\), \(\pi\) is measure-preserving onto Lebesgue. For \(p\neq 1/2\), \(\pi\) is not.
The “right” space for Bernoulli\((p)\) is the shift space, not \([0,1]\).
2.4.2 \(p\)-adic analogy
In the \(p\)-adic integers \(\mathbb Z_p\) (here \(p\) denotes the prime base, not the Bernoulli bias), Haar measure corresponds to i.i.d. digits uniform on \(\{0,\dots,p-1\}\). That is the exact analog of Bernoulli\((1/2)\) → Lebesgue.
If you bias the digits in \(\mathbb Z_p\), you again get a measure that is singular with respect to Haar. So the phenomenon is not special to base 2 or to \(\mathbb R\); it is structural.
- Uniform measure ↔ Haar measure ↔ unbiased digits
- Biased digits ↔ singular measure
There is a \(p\)-adic analogy, but it reinforces the same moral rather than escaping it.
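A small sketch of the structural point in base 3 (the base and the digit weights are my choices): under the Haar/uniform digit law every digit frequency tends to \(1/3\); under a biased digit law it tends to the biased weights. By the strong law the two measures therefore put full mass on disjoint sets of digit sequences, which is mutual singularity.

```python
import numpy as np

rng = np.random.default_rng(5)
base, n_digits = 3, 100_000

haar_digits = rng.integers(0, base, size=n_digits)                   # uniform digits: Haar
biased_digits = rng.choice(base, size=n_digits, p=[0.6, 0.3, 0.1])   # biased digit law

for name, digits in [("Haar", haar_digits), ("biased", biased_digits)]:
    freqs = [(digits == d).mean() for d in range(base)]
    print(f"{name:>7} digit frequencies: {np.round(freqs, 3)}")
# The long-run frequency vectors differ, so each measure concentrates on a set
# the other one does not see: the biased measure is singular with respect to Haar.
```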
2.5 Bottom line
This is why the textbooks do what they do:
- Bernoulli\((1/2)\) is not chosen for convenience; it is forced if you want Lebesgue.
- Bernoulli\((p)\) works abstractly but destroys regularity.
- There is no alternative “nice” continuous reference measure on \([0,1]\) coming from biased digits.
- Ryff-type rearrangement and law-invariance arguments implicitly rely on the Lebesgue structure.
The related concepts (Bernoulli convolutions, symbolic dynamics, Haar measure on compact groups) form a rich area, but one that confirms rather than undermines the canonical role of the uniform.
3 Why always a countable collection?
You can make uncountable collections but only in a very specific sense, and not in the naïve sense people often mean. The obstruction is measurability, not probability.
3.1 Countable vs uncountable independence
On a standard (atomless) probability space you can construct countably many independent random variables with essentially arbitrary laws. That is routine and equivalent to supporting a single uniform and taking products.
For an uncountable index set \(I\) (say \(I=[0,1]\)), the situation changes qualitatively.
3.2 What is possible: abstract product measures
Fix any family of probability measures \(\{\mu_i : i\in I\}\) on measurable spaces \((E_i,\mathcal E_i)\).
There exists a probability measure on the product space \[ \Bigl(\prod_{i\in I} E_i,\ \bigotimes_{i\in I}\mathcal E_i\Bigr) \] such that the coordinate projections \((X_i)_{i\in I}\) are independent and \(X_i\sim\mu_i\).
This is guaranteed by the Kolmogorov extension / product-measure construction and does not require \(I\) to be countable.
So in a formal sense one can define a continuum of independent random variables.
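Operationally, though, such a family is only ever visible through the coordinates one actually asks for. A hedged sketch (the lazy hashing scheme is my own illustrative device, not a construction from the discussion above): an “i.i.d. uniform family” indexed by arbitrary real numbers, realized one coordinate at a time.

```python
import numpy as np

class LazyUniformFamily:
    """Coordinate projections X_i for an uncountable index set, sampled on demand.

    Each requested coordinate gets an independent-looking Unif(0,1) value derived
    from (seed, i); only the coordinates actually queried are ever realized.
    """

    def __init__(self, seed: int):
        self.seed = seed
        self._cache: dict[float, float] = {}

    def __getitem__(self, i: float) -> float:
        if i not in self._cache:
            # One generator per requested coordinate (illustrative, not rigorous independence).
            sub_rng = np.random.default_rng([self.seed, hash(i) & 0xFFFFFFFF])
            self._cache[i] = float(sub_rng.random())
        return self._cache[i]

X = LazyUniformFamily(seed=42)
print(X[0.25], X[np.pi / 4], X[0.25])                  # repeated queries are consistent
print("coordinates realized so far:", len(X._cache))   # only the ones we touched
```

Any finite or countable computation with such a family is unproblematic; the pathologies below appear only when one tries to treat all coordinates at once as a function of the index.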
3.3 What fails: joint measurability and paths
The problem is that once \(I\) is uncountable, the object \[ \omega \mapsto (X_i(\omega))_{i\in I} \] no longer behaves like a “random function” in any reasonable measurable sense.
3.3.1 No jointly measurable process
You cannot have a map \[ (\omega,i)\mapsto X_i(\omega) \] that is jointly measurable with respect to \(\mathcal F\otimes\mathcal B(I)\) and whose coordinates are independent and non-degenerate across a continuum of indices. This is a theorem, not a pathology.
3.3.2 Almost surely pathological sample paths
If \((X_t)_{t\in[0,1]}\) are independent and non-degenerate, then:
- with probability one, the map \(t\mapsto X_t(\omega)\) is nowhere continuous,
- the set of \(\omega\) for which that map is Lebesgue measurable is itself non-measurable (see Section 4),
- the process admits no modification with jointly measurable paths.
In fact, it is “worse than white noise.” White noise only exists as a generalized random object; pointwise independent noise does not exist as a function-valued random variable.
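A numerical caricature of the pathology (the discretization choices are mine): sample “independent noise” on finer and finer meshes and track the largest jump between neighboring mesh points. For a process with continuous paths this shrinks with the mesh; for independent values it stays of order one.

```python
import numpy as np

rng = np.random.default_rng(6)

for k in range(4, 15, 2):
    n = 2**k                               # mesh width 2^{-k} on [0, 1]
    values = rng.random(n + 1)             # "X_t" at the mesh points, i.i.d. Unif(0,1)
    max_jump = np.max(np.abs(np.diff(values)))
    print(f"mesh 2^-{k:<2d}  largest neighboring jump: {max_jump:.3f}")
# The largest jump does not shrink as the mesh is refined: the samples are not
# converging to any continuous (or otherwise regular) function of t.
```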
3.3.3 No filtration indexed by a continuum
You cannot form a useful filtration \((\mathcal F_t)_{t\in[0,1]}\) generated by uncountably many independent coordinates while preserving standard measurability properties (right-continuity, separability, etc.).
This is why stochastic processes are always built from countable generating sets, even when indexed by continuous time.
3.4 Canonical example: product Lebesgue space
Take \[ (\Omega,\mathcal F,\mathsf P) = ([0,1]^{[0,1]}, \text{product }\sigma\text{-field}, \lambda^{\otimes [0,1]}). \]
- Each coordinate projection \(X_t(\omega)=\omega(t)\) is \(\mathrm{Unif}(0,1)\).
- The family \((X_t)_{t\in[0,1]}\) is independent.
- But the path \(t\mapsto X_t(\omega)\) cannot be treated as a measurable function of \(t\): the set of \(\omega\) for which it is Lebesgue measurable is not even a measurable event.
Thus, independence exists coordinatewise, but not process-wise.
3.5 The sharp boundary and an important intuition
Countable independence behaves like algebra. Uncountable independence behaves like logic.
Probability theory is built so that:
- all real constructions reduce to countable operations;
- uncountable collections exist only as projective limits, not as concrete random objects.
This is not a technical inconvenience—it is why:
- Brownian motion has correlated values (its increments over disjoint intervals are independent, but its values are not),
- white noise is a distribution, not a function,
- filtrations are required to be separable,
- simulation always reduces to countable randomness.
3.6 Bottom line
- ✔ You can define a continuum of independent random variables as coordinate projections on a product space.
- ✘ You cannot treat them as a jointly measurable random field.
- ✘ You cannot view them as a “random function” of the index.
- ✔ All practical stochastic modeling avoids this by enforcing separability or correlation.
- Mental model: Uncountable independence exists formally, but never operationally.
4 The diagonal pathology and non-separability
4.1 The diagonal pathology
Take the continuum product space \[ (\Omega,\mathcal F,\mathsf P) = \bigl([0,1]^{[0,1]},\ \mathcal B([0,1])^{\otimes [0,1]},\ \lambda^{\otimes [0,1]}\bigr), \] and define coordinate projections \(X_t(\omega)=\omega(t)\). Consider the diagonal event \[ D := \{\omega : X_s(\omega)=X_t(\omega) \} \quad (s\neq t). \] Important facts:
- For fixed \(s\neq t\), the set \(D\) is measurable and has probability zero.
- But the set \[ \{\omega : t\mapsto X_t(\omega)\ \text{is measurable}\} \] is not measurable with respect to the product \(\sigma\)-algebra: it has inner probability zero and outer probability one.
So the diagonal itself is not the main culprit—it’s the uncountable union/intersection structure behind “pathwise” statements that fails measurability.
4.2 Where measurability actually breaks
To say that \((X_t)_{t\in[0,1]}\) is a “process” in the usual sense requires that the evaluation map \[ (\omega,t)\mapsto X_t(\omega) \] be measurable with respect to \(\mathcal F\otimes\mathcal B([0,1])\).
That fails. Why? Joint measurability would imply, via Fubini, that for almost every \(\omega\) the section \[ t\mapsto X_t(\omega) \] is Borel measurable, and it would make pathwise averages such as \(\int_a^b X_t\,dt\) well defined. But for a jointly measurable process with independent, hence uncorrelated, coordinates (truncate to make them bounded), such an integral has variance \[ \int_a^b\!\!\int_a^b \mathrm{Cov}(X_s,X_t)\,ds\,dt = 0, \] since the covariance vanishes off the diagonal and the diagonal is Lebesgue-null. It follows that \(X_t\) is almost surely constant for Lebesgue-almost every \(t\), so joint measurability is incompatible with independent, non-degenerate coordinates indexed by a continuum.
Again, this is a theorem, not folklore.
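The Fubini/covariance argument above can be checked numerically (the discretization is mine): Riemann sums of independent noise lose their randomness as the mesh refines, so any would-be pathwise integral is deterministic.

```python
import numpy as np

rng = np.random.default_rng(7)
n_paths = 100

for k in (4, 8, 12, 14):
    n = 2**k
    # (1/n) * sum of X_t over n mesh points, repeated over many independent "paths"
    riemann_sums = rng.random((n_paths, n)).mean(axis=1)
    print(f"mesh 2^-{k:<2d}  std of pathwise Riemann sums: {riemann_sums.std():.5f}")
# The standard deviation shrinks like n^{-1/2}: in the limit the "integral" carries
# no randomness at all, which is incompatible with genuine randomness at every t
# under joint measurability.
```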
4.3 The real obstruction: non-separability
4.3.1 Countable vs uncountable generators
- A standard probability space is generated (mod null sets) by a countable family.
- Product spaces indexed by an uncountable set are not standard.
- The product \(\sigma\)-algebra is too small to contain pathwise events (every measurable set depends on only countably many coordinates), while the index set is too large for the space to be countably generated modulo null sets.
Equivalently:
- You cannot approximate uncountably many coordinates using countably many measurable tests.
- There is no countable determining class.
- Hence no separable modification exists.
This is why all “good” stochastic processes (Brownian motion, Lévy processes, semimartingales) are built so that:
- the index set is uncountable,
- but the randomness is generated by a countable dense subset (e.g. rationals),
- and everything else is obtained by completion or continuity.
Independent noise at every point destroys this structure.
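For contrast, a minimal sketch of how a “good” process really does run on countably many random variables (the Lévy midpoint construction of Brownian motion on the dyadic rationals; the depth and names are my choices): the randomness lives on a countable dense set, and the rest of the path is filled in by continuity.

```python
import numpy as np

rng = np.random.default_rng(8)

def brownian_dyadic(levels: int, rng) -> np.ndarray:
    """Lévy midpoint construction: Brownian motion at the dyadic points k / 2^levels."""
    W = np.array([0.0, rng.normal(0.0, 1.0)])   # values at t = 0 and t = 1
    for level in range(1, levels + 1):
        midpoints = 0.5 * (W[:-1] + W[1:])
        # Brownian bridge: midpoint = average of the endpoints + N(0, 2^{-(level+1)})
        midpoints += rng.normal(0.0, np.sqrt(2.0 ** -(level + 1)), size=midpoints.size)
        new_W = np.empty(2 * W.size - 1)
        new_W[0::2] = W
        new_W[1::2] = midpoints
        W = new_W
    return W

W = brownian_dyadic(levels=12, rng=rng)          # 2^12 + 1 dyadic values: countably much randomness
increments = np.diff(W)
print("largest dyadic increment  :", round(float(np.abs(increments).max()), 4))   # shrinks with the mesh
print("increment variance * 2^12 :", round(float(increments.var() * 2**12), 3))   # ≈ 1
```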
4.4 Relation to diagonals more generally
The diagonal intuition:
- In countable products, diagonals are harmless.
- In uncountable products, diagonals encode equality across an uncountable index set.
- Such events require control of uncountably many coordinates simultaneously.
- The product \(\sigma\)-algebra cannot “see” them properly.
This is the same phenomenon behind:
- failure of Fubini–Tonelli arguments in the absence of joint measurability,
- existence of non-measurable sections,
- impossibility of white noise as a function-valued random variable.
4.5 Bottom line
Uncountable independence is compatible with probability, but incompatible with paths.
More precisely:
- You can have a continuum of independent coordinates.
- You cannot have a jointly measurable random field with those coordinates.
- The obstruction is not a probability-zero event, but non-separability: no countable family of coordinates determines the whole process.
- Diagonal pathologies are a symptom, not the disease.
This is why:
- Brownian motion has correlated values but separable paths,
- white noise exists only as a generalized random object,
- filtrations are required to satisfy the usual conditions (right-continuity and completeness) and, in practice, to be countably generated up to null sets.
