Suppose we repeat an experiment n times ("n trials"). Each trial produces one of two possible outcomes, success or failure ({S, F}). The trials are independent, and P(S) = p is the same for every trial.
Let X be the number of successes among the n trials. Then we call these n trials a binomial experiment and X a binomial random variable, written X ~ bin(n, p), with parameters n and p.
Ex. X follows bin(4,p). Find the probability mass function for X.
Possible values of X: 0, 1, 2, 3, 4.
$p(0) = P(X=0) = (1-p)^4$
$p(1) = P(X=1) = \binom{4}{1}\, p(1-p)^3 = 4p(1-p)^3$
We multiply by 4 above because there are $\binom{4}{1} = 4$ possible ways to choose which one of the four trials is the success.
$p(2) = P(X=2) = \binom{4}{2}\, p^2(1-p)^2 = 6p^2(1-p)^2$
$p(3) = P(X=3) = \binom{4}{3}\, p^3(1-p)^1 = 4p^3(1-p)$
$p(4) = P(X=4) = \binom{4}{4}\, p^4 = p^4$
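The five values above can be checked numerically. A minimal sketch in Python (the value of p and the function name are my own choices, for illustration only):

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X following bin(n, p): C(n, k) * p^k * (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

p = 0.3  # arbitrary success probability, chosen for illustration
pmf = [binom_pmf(k, 4, p) for k in range(5)]

# The five probabilities cover every possible value of X, so they sum to 1.
print(pmf)
print(sum(pmf))
```

Varying p changes the individual probabilities, but the sum over k = 0, ..., 4 is always 1.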
In general, if X follows bin(n, p), then $P(X=k) = \binom{n}{k}\, p^k (1-p)^{n-k}$.
Theorem. If X follows bin(n, p), then the probability mass function for X is
$$b(k; n, p) = \begin{cases} \binom{n}{k}\, p^k (1-p)^{n-k} & 0 \le k \le n \\ 0 & \text{otherwise} \end{cases}$$
The cumulative distribution function for X is $B(x; n, p) = \sum_{y=0}^{\lfloor x \rfloor} b(y; n, p)$ for $x > 0$.
Note that there is no closed form for this cdf; the easiest way to calculate it is to work out each value of b and sum them. A binomial table can also simplify the calculations.
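Summing pmf values term by term is exactly what the definition prescribes; a small sketch, with function names matching the notation above (parameters are arbitrary):

```python
from math import comb, floor

def b(k, n, p):
    """pmf b(k; n, p) = C(n, k) * p^k * (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def B(x, n, p):
    """cdf B(x; n, p): sum of b(y; n, p) for y = 0, ..., floor(x)."""
    return sum(b(y, n, p) for y in range(floor(x) + 1))

# P(X <= 2) for X following bin(4, 0.3) is b(0) + b(1) + b(2).
print(B(2, 4, 0.3))
```

Since there is no closed form, the cost of evaluating B grows with floor(x); for large n this is where tables (or library routines) earn their keep.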
Expected Value and Variance
Theorem. If X follows bin(n, p), then $E(X) = np$ and $V(X) = np(1-p)$.
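These formulas can be sanity-checked directly from the pmf, computing E(X) as the probability-weighted sum of the values k and V(X) as the weighted sum of squared deviations. The parameters below are arbitrary:

```python
from math import comb

n, p = 10, 0.25  # arbitrary parameters, for illustration
pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

mean = sum(k * pk for k, pk in pmf.items())            # should equal np = 2.5
var = sum((k - mean) ** 2 * pk for k, pk in pmf.items())  # should equal np(1-p) = 1.875

print(mean, var)
```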
Now that we understand the binomial distribution, we can draw some conclusions about its properties and how it relates to the Bernoulli distribution:
- Bernoulli(p)=bin(1,p).
- Suppose we perform n independent experiments, each following Bernoulli(p), where Xi = 1 if the ith experiment is a success and 0 otherwise. Then the sum Y = X1 + X2 + ... + Xn follows bin(n,p).
- Suppose X1 follows bin(n1,p) and X2 follows bin(n2,p). Then Y=X1+X2 follows bin(n1+n2,p).
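The connection between Bernoulli trials and the binomial distribution can be illustrated by simulation; a sketch using only the standard library (the parameters, seed, and sample count are arbitrary):

```python
import random
from math import comb

random.seed(1)
n, p, trials = 5, 0.4, 200_000

# Each sample is the sum of n independent Bernoulli(p) outcomes,
# so by the second bullet it should follow bin(n, p).
samples = [sum(random.random() < p for _ in range(n)) for _ in range(trials)]

# Compare empirical frequencies against the bin(n, p) pmf.
for k in range(n + 1):
    empirical = samples.count(k) / trials
    theoretical = comb(n, k) * p**k * (1 - p)**(n - k)
    print(k, round(empirical, 3), round(theoretical, 3))
```

Summing two such samples drawn with the same p (but possibly different n) would likewise illustrate the third bullet: the result matches bin(n1 + n2, p).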