FS1 - Probability Generating Functions - Sums of Random Variables

Let X and Y be discrete random variables. We might be interested in the distribution of X + Y. For convenience, we will focus on the case where X and Y are independent. Throughout, a, b and c are non-negative integers (this saves me having to reintroduce them constantly).

Since (as we mentioned in the previous section) a probability generating function uniquely determines a distribution, if we can compute the probability generating function of X + Y, we can compare it with the probability generating functions of known distributions to deduce its distribution. This (and similar techniques) lets us deduce certain key properties of distributions. For an example, see exercise 2 at the bottom of this section.

Let Z = aX + bY (for neatness) and let G_X, G_Y, G_Z be the probability generating functions of X, Y, Z respectively. We have quite a nice relation between these functions:

\displaystyle G_X(t^a) G_Y(t^b) = G_Z(t)

for all t for which all of these functions are defined.
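
As a quick illustration of how this gets used, take a = b = 1 and suppose X \sim \mathrm B(n, p) and Y \sim \mathrm B(m, p) are independent. Assuming the standard binomial probability generating function (1 - p + pt)^n (which is not derived in this section), the relation gives:

\begin{align*}G_{X + Y}(t) & = G_X(t) G_Y(t) \\ & = (1 - p + pt)^n (1 - p + pt)^m \\ & = (1 - p + pt)^{n + m}\end{align*}

This is the probability generating function of a \mathrm B(n + m, p) distribution, so by uniqueness X + Y \sim \mathrm B(n + m, p).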

We will use the following facts without proof.

  • If X and Y are independent random variables then \displaystyle \mathrm E(XY) = \mathrm E(X) \mathrm E(Y)

  • If X and Y are independent random variables and f and g are sensible functions, then f(X) and g(Y) are independent.

Hopefully these two results are fairly intuitive.
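
If you want to convince yourself numerically, here is a minimal sketch that checks the product-of-expectations fact exactly on a small example. The two distributions, the value of t and the helper name `expectation` are made up purely for illustration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical example distributions (value -> probability), chosen arbitrarily.
pmf_x = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
pmf_y = {0: Fraction(1, 4), 3: Fraction(3, 4)}


def expectation(pmf, f=lambda v: v):
    """E(f(V)) for a random variable V with the given pmf."""
    return sum(f(v) * p for v, p in pmf.items())


# Independence means P(X = x, Y = y) = P(X = x) P(Y = y).
joint = {
    (x, y): px * py
    for (x, px), (y, py) in product(pmf_x.items(), pmf_y.items())
}

# E(XY) computed from the joint distribution, against E(X) E(Y).
e_xy = sum(x * y * p for (x, y), p in joint.items())
e_x_e_y = expectation(pmf_x) * expectation(pmf_y)

# The same check for functions of X and Y (here t^(2X) and t^(3Y)).
t = Fraction(1, 2)
e_fg = sum(t ** (2 * x) * t ** (3 * y) * p for (x, y), p in joint.items())
e_f_e_g = (
    expectation(pmf_x, lambda x: t ** (2 * x))
    * expectation(pmf_y, lambda y: t ** (3 * y))
)

print(e_xy == e_x_e_y, e_fg == e_f_e_g)
```

Using `Fraction` keeps everything exact, so the two sides can be compared with `==` rather than up to rounding error.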

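Putting these facts together (t^{aX} and t^{bY} are independent by the second fact, so the expectation of their product factorises by the first), we have: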

\begin{align*}G_Z(t) & = \mathrm E(t^Z) \\ & = \mathrm E(t^{aX + bY}) \\ & = \mathrm E(t^{aX} t^{bY}) \\ & = \mathrm E(t^{aX}) \mathrm E(t^{bY}) \\ & = \mathrm E\left((t^a)^X\right) \mathrm E\left((t^b)^Y\right) \\ & = G_X(t^a) G_Y (t^b)\end{align*}

You will not be required to recall this proof, but you will be expected to know this fact.
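
The relation can also be sanity-checked on a concrete example. The sketch below builds the distribution of Z = aX + bY directly from two independent distributions and compares both sides at a few values of t. The distributions, the choice a = 2, b = 3 and the helper names are assumptions made for illustration only.

```python
from fractions import Fraction


def pgf(pmf, t):
    """Evaluate G(t) = E(t^V) for a pmf given as {value: probability}."""
    return sum(p * t ** k for k, p in pmf.items())


def pmf_linear_combination(pmf_x, pmf_y, a, b):
    """Distribution of aX + bY when X and Y are independent."""
    pmf_z = {}
    for x, px in pmf_x.items():
        for y, py in pmf_y.items():
            z = a * x + b * y
            pmf_z[z] = pmf_z.get(z, 0) + px * py
    return pmf_z


# Hypothetical example distributions, chosen arbitrarily.
pmf_x = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
pmf_y = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
a, b = 2, 3

pmf_z = pmf_linear_combination(pmf_x, pmf_y, a, b)

# Compare G_X(t^a) G_Y(t^b) with G_Z(t) at several values of t.
checks = []
for t in (Fraction(0), Fraction(1, 2), Fraction(1), Fraction(3, 2)):
    lhs = pgf(pmf_x, t ** a) * pgf(pmf_y, t ** b)
    rhs = pgf(pmf_z, t)
    checks.append(lhs == rhs)

print(checks)
```

Agreement at a handful of points is only a check, not a proof, but it is a handy way to catch algebra slips.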

A similar result is that if Z = X + c, we have:

\displaystyle G_Z(t) = t^c G_X(t)

See if you can prove this: start by writing the left-hand side as an expectation.
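
Without giving the proof away, here is a small numerical check of the shift result on a made-up distribution. The distribution, the value c = 3 and the names used are illustrative assumptions.

```python
from fractions import Fraction


def pgf(pmf, t):
    """Evaluate G(t) = E(t^V) for a pmf given as {value: probability}."""
    return sum(p * t ** k for k, p in pmf.items())


# Hypothetical example distribution for X, chosen arbitrarily.
pmf_x = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
c = 3

# Z = X + c takes the value k + c exactly when X takes the value k.
pmf_shifted = {k + c: p for k, p in pmf_x.items()}

t = Fraction(2, 3)
shift_lhs = pgf(pmf_shifted, t)
shift_rhs = t ** c * pgf(pmf_x, t)

print(shift_lhs == shift_rhs)
```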

We will finish off with a theorem you may have already seen. Let X and Y be independent random variables. Then:

\displaystyle \mathrm E(aX + bY) = a \mathrm E(X) + b \mathrm E(Y)

provided these expectations all exist.

We prove this in the case that a, b are non-negative integers.

We have already proved that:

\displaystyle G_Z (t) = G_X (t^a) G_Y(t^b)

where Z = aX + bY.

Note that by the chain rule we have:

\displaystyle \frac {\mathrm d} {\mathrm dt} \left(G_X(t^a)\right) = at^{a - 1} G'_X(t^a)

and similarly:

\displaystyle \frac {\mathrm d} {\mathrm dt} \left(G_Y(t^b)\right) = bt^{b - 1} G'_Y(t^b)

We therefore have, by the product rule:

\displaystyle G'_Z(t) = at^{a - 1} G'_X(t^a) G_Y(t^b) + bt^{b - 1} G'_Y(t^b) G_X(t^a)

This is a bit of a mess, but if we just set t = 1:

\displaystyle G'_Z(1) = a G'_X(1) G_Y(1) + b G'_Y(1) G_X(1)

We know that G'_X(1) = \mathrm E(X), G'_Y(1) = \mathrm E(Y), and G_X(1) = G_Y(1) = 1, so:

\displaystyle G'_Z(1) = \mathrm E(Z) = \mathrm E(aX + bY) = a \mathrm E(X) + b \mathrm E(Y)

and we are done.
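
The steps of this argument can be mirrored in code by treating each probability generating function as a polynomial (a list of coefficients, where index k holds the coefficient of t^k). This is only a sketch on made-up distributions with finitely many values; the helper names are hypothetical.

```python
from fractions import Fraction


def substitute_power(coeffs, a):
    """Coefficients of G(t^a), given the coefficients of G(t)."""
    if a == 0:
        return [sum(coeffs)]  # G(t^0) = G(1), a constant
    out = [Fraction(0)] * (a * (len(coeffs) - 1) + 1)
    for k, p in enumerate(coeffs):
        out[a * k] = p
    return out


def multiply(p, q):
    """Coefficients of the product of two polynomials."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def derivative_at_one(coeffs):
    """G'(1), which is the sum of k times the coefficient of t^k."""
    return sum(k * p for k, p in enumerate(coeffs))


# Hypothetical distributions: entry k is P(X = k) or P(Y = k).
g_x = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
g_y = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
a, b = 2, 3

# G_Z(t) = G_X(t^a) G_Y(t^b), then E(Z) = G'_Z(1).
g_z = multiply(substitute_power(g_x, a), substitute_power(g_y, b))
mean_z = derivative_at_one(g_z)
mean_from_formula = a * derivative_at_one(g_x) + b * derivative_at_one(g_y)

print(mean_z, mean_from_formula)
```

Here E(X) = 2/3 and E(Y) = 1, so both values should come out as 13/3.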


  • Let X be a discrete random variable with finite expectation and probability generating function G_X. Show that the probability generating function of Z = X + c is given by G_Z(t) = t^c G_X(t). Deduce that \mathrm E(X + c) = \mathrm E(X) + c.
  • Let X and Y be discrete and independent with finite variances. Prove that \mathrm {var}(a X + bY) = a^2 \mathrm{var}(X) + b^2 \mathrm {var}(Y).
  • Prove using probability generating functions that if X, Y are independent discrete random variables with X \sim \mathrm {Po}(\lambda) and Y \sim \mathrm {Po}(\mu), then X + Y \sim \mathrm{Po}(\lambda + \mu). You don’t have to re-derive the probability generating function of the Poisson distribution.

:star: :star: :star: :star: :star:

FWIW, if you want to see a proof of the two facts used here in the case where X, Y are discrete, see: https://proofwiki.org/wiki/Condition_for_Independence_from_Product_of_Expectations (this is beyond A-level, so don’t worry if you don’t follow parts of it).
