Definition. A discrete random variable $X$ follows the Poisson distribution with parameter $\lambda$ ($\lambda > 0$) if its probability mass function is $P(X = k) = e^{-\lambda} \dfrac{\lambda^k}{k!}$ for $k = 0, 1, 2, \ldots$
The cumulative distribution function $P(X \le k)$ has no simple closed form; you calculate it by summing the probability mass function over $0, 1, \ldots, k$, or by looking it up in a Poisson table.
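In place of a table, the mass function and its running sum can be evaluated directly. The following is a minimal sketch using only the Python standard library; the function names `poisson_pmf` and `poisson_cdf` are illustrative choices, not part of the notes.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam): e^(-lam) * lam^k / k!."""
    if k < 0:
        return 0.0
    # Fine for small k; very large k would overflow the float division.
    return math.exp(-lam) * lam**k / math.factorial(k)


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k), computed as a finite sum of mass function values."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))


if __name__ == "__main__":
    lam = 2.0
    for k in range(6):
        print(k, poisson_pmf(k, lam), poisson_cdf(k, lam))
```

The direct sum is adequate for the small values of $k$ and $\lambda$ that a table would cover; for large arguments a numerical library would be a safer choice.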
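Observation. Recall that the Taylor series expansion of $e^x$ is $\sum_{k=0}^{\infty} \dfrac{x^k}{k!}$. Using this fact, we see that the sum of $P(X = k)$ over all possible values of $k$ is $\sum_{k=0}^{\infty} e^{-\lambda} \dfrac{\lambda^k}{k!} = e^{-\lambda} e^{\lambda} = 1$, which is what we’d expect for a valid probability distribution.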
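Proposition. Suppose $X$ follows $\mathrm{Bin}(n, p)$. As $n \to \infty$ and $p \to 0$ with $np \to \lambda$, the distribution of $X$ converges to the Poisson distribution $\mathrm{Poisson}(\lambda)$.
(Rule of thumb: When $n$ is large and $p$ is small, we can use the above proposition to approximate $\mathrm{Bin}(n, p)$ with $\mathrm{Poisson}(np)$ and vice versa.)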
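To see the convergence numerically, one can hold $\lambda = np$ fixed and let $n$ grow. The sketch below (standard library only; `binomial_pmf`, `poisson_pmf` and `max_gap` are illustrative names, and the cutoff of 20 values of $k$ is an arbitrary choice) reports the largest pointwise difference between the two mass functions.

```python
import math


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Bin(n, p)."""
    if k < 0 or k > n:
        return 0.0
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def max_gap(n: int, lam: float, k_max: int = 20) -> float:
    """Largest |Bin(n, lam/n) - Poisson(lam)| mass difference for k < k_max."""
    p = lam / n
    return max(
        abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(k_max)
    )


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, max_gap(n, lam=2.0))
```

The printed gap shrinks as $n$ increases, which is the behaviour the proposition describes.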
The Poisson distribution can be used to model the number of random visits/arrivals that will occur in a fixed period of time.
Ex. 100 people pass by a store in 10 minutes. The probability that any one enters the store is 0.02. If $X$ is the number of people that entered, then $X$ follows $\mathrm{Bin}(100, 0.02)$.
Following our rule of thumb, we can approximate the probability mass function of $X$ with that of $\mathrm{Poisson}(2)$, since $np = 100 \times 0.02 = 2$.
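As a quick check of how close the approximation is in this example, compare the probability that nobody enters the store under each model (values rounded to four decimal places):

```latex
\begin{align*}
P(X = 0) &= (1 - 0.02)^{100} = 0.98^{100} \approx 0.1326
  && \text{(binomial, } n = 100,\ p = 0.02\text{)} \\
P(X = 0) &\approx e^{-2} \approx 0.1353
  && \text{(Poisson, } \lambda = np = 2\text{)}
\end{align*}
```

The two values differ by less than 0.003.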
Suppose $X$ follows $\mathrm{Poisson}(\lambda)$. Then $E[X] = \lambda$ and $\mathrm{Var}(X) = \lambda$. This interesting property is another reason we study the Poisson distribution.
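Both moments can be derived from the mass function $e^{-\lambda} \lambda^k / k!$ and the Taylor series of the exponential. One standard derivation, sketched here, goes through $E[X(X-1)]$:

```latex
\begin{align*}
E[X] &= \sum_{k=0}^{\infty} k \, e^{-\lambda} \frac{\lambda^k}{k!}
      = \lambda e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^{k-1}}{(k-1)!}
      = \lambda e^{-\lambda} e^{\lambda} = \lambda, \\
E[X(X-1)] &= \sum_{k=0}^{\infty} k(k-1) \, e^{-\lambda} \frac{\lambda^k}{k!}
      = \lambda^2 e^{-\lambda} \sum_{k=2}^{\infty} \frac{\lambda^{k-2}}{(k-2)!}
      = \lambda^2, \\
\mathrm{Var}(X) &= E[X(X-1)] + E[X] - (E[X])^2
      = \lambda^2 + \lambda - \lambda^2 = \lambda.
\end{align*}
```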
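Suppose a collection of random variables $X_i$ all follow $\mathrm{Poisson}(\lambda_i)$. Think of them as representing the number of visits that occur in a sequence of time periods, so that they are independent.
If we sum all $X_i$ to produce a new random variable $Y$, then $Y$ also follows a Poisson distribution, $\mathrm{Poisson}(\lambda)$, where $\lambda$ is the sum of all $\lambda_i$. (Independent binomial random variables that share the same success probability $p$ also have this property.)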
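For two independent variables the sum property can be checked numerically: the mass function of the sum is the convolution of the two individual mass functions. The sketch below uses illustrative names (`poisson_pmf`, `sum_pmf`) and arbitrary example parameters 1.5 and 2.5.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    if k < 0:
        return 0.0
    return math.exp(-lam) * lam**k / math.factorial(k)


def sum_pmf(k: int, lam1: float, lam2: float) -> float:
    """P(X1 + X2 = k) for independent X1 ~ Poisson(lam1), X2 ~ Poisson(lam2).

    Independence lets us write the probability as a convolution:
    sum over j of P(X1 = j) * P(X2 = k - j).
    """
    return sum(
        poisson_pmf(j, lam1) * poisson_pmf(k - j, lam2) for j in range(k + 1)
    )


if __name__ == "__main__":
    lam1, lam2 = 1.5, 2.5
    for k in range(8):
        # The two columns should agree up to floating-point rounding.
        print(k, sum_pmf(k, lam1, lam2), poisson_pmf(k, lam1 + lam2))
```

The agreement relies on independence; without it the convolution formula does not apply.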
Suppose $X$ follows $\mathrm{Poisson}(\lambda s)$ and represents the number of visits from time 0 to time $s$, and $Y$ follows $\mathrm{Poisson}(\lambda t)$ and represents the number of visits from time $s$ to time $s + t$. (Note that $s$ and $t$ are durations of time, not points in time.)
Then $Z$, the number of visits from time 0 to time $s + t$, is simply $X + Y$ and follows $\mathrm{Poisson}(\lambda (s + t))$, from the sum property described above.
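Definition. Given parameter $\lambda > 0$, suppose visits occur at times $0 < T_1 < T_2 < \cdots$.
Let $N(s, t)$ be the number of visits from time $s$ to time $t$, with $s < t$. In other words, $N(s, t) = |\{\, i : s < T_i \le t \,\}|$.
If $N(s, t)$ follows $\mathrm{Poisson}(\lambda (t - s))$ for every such interval, and the counts over disjoint intervals are independent, then we call the collection of $N(s, t)$ a Poisson process. Here, $\lambda$ is the parameter of the Poisson process, also called the rate of the process.
From the observations we’ve made before, we see that $N(s, t) + N(t, u) = N(s, u)$ for $s < t < u$, and that both sides follow $\mathrm{Poisson}(\lambda (u - s))$; the number of visits in two consecutive intervals of time adds up to the number of visits made over the union of those intervals, which makes sense intuitively. This extends to sums over any number of consecutive intervals.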
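A Poisson process can be approximated in simulation by leaning on the binomial-to-Poisson proposition: chop time into many short slots and let each slot contain a visit with a small probability. The sketch below is only that approximation, not an exact sampler, and all names and defaults (`simulate_visit_times`, `count_visits`, `slots_per_unit=1000`) are hypothetical choices.

```python
import random


def simulate_visit_times(rate, horizon, slots_per_unit=1000, rng=None):
    """Approximate visit times of a rate-`rate` process on (0, horizon].

    Each slot of width dt = 1/slots_per_unit holds a visit with
    probability rate * dt (assumed to be well below 1), independently
    of the other slots. A visit is recorded at the end of its slot.
    """
    rng = rng or random.Random()
    dt = 1.0 / slots_per_unit
    p = rate * dt
    n_slots = int(horizon * slots_per_unit)
    return [(i + 1) * dt for i in range(n_slots) if rng.random() < p]


def count_visits(times, s, t):
    """Number of visits in the interval (s, t]."""
    return sum(1 for x in times if s < x <= t)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    times = simulate_visit_times(rate=2.0, horizon=5.0, rng=rng)
    first = count_visits(times, 0.0, 2.0)
    second = count_visits(times, 2.0, 5.0)
    total = count_visits(times, 0.0, 5.0)
    print(first, second, total)  # first + second equals total
```

Because the intervals are half-open, every visit in $(0, 5]$ falls in exactly one of $(0, 2]$ and $(2, 5]$, so the counts add up exactly; averaged over many runs the total count should be close to rate times duration.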
If $X_1, \ldots, X_n$ are arbitrary discrete random variables, then $E[X_1 + \cdots + X_n] = E[X_1] + \cdots + E[X_n]$, since expectation is linear.
In addition, if these variables are all independent and identically distributed, then $E[X_1 + \cdots + X_n] = n\,E[X_1]$ and $\mathrm{Var}(X_1 + \cdots + X_n) = n\,\mathrm{Var}(X_1)$.
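As a consistency check, take $X_1, \ldots, X_n$ independent and each following $\mathrm{Poisson}(\lambda)$, and write $S$ for their sum (the symbol $S$ is introduced here only for illustration). Using the Poisson mean and variance, both equal to the parameter:

```latex
\begin{align*}
E[S] &= E[X_1] + \cdots + E[X_n] = n\lambda, \\
\mathrm{Var}(S) &= \mathrm{Var}(X_1) + \cdots + \mathrm{Var}(X_n) = n\lambda.
\end{align*}
```

This agrees with the sum property, under which $S$ follows $\mathrm{Poisson}(n\lambda)$ and so has mean and variance $n\lambda$.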