Recall our definition of conditional probability from the last lecture: P(A∣B) = P(A∩B) / P(B), provided P(B) > 0.
If we know P(A∣B) and P(B), we get P(A∩B)=P(A∣B)⋅P(B).
Applying this fact twice, P(A∩B∩C)=P(A∣B∩C)⋅P(B∩C)=P(A∣B∩C)⋅P(B∣C)⋅P(C).
Ex. Using the example from last time:
12/100 of customers bought diapers (P(C)). Among them, 10/12 bought milk products (P(B∣C)).
Additionally, among customers that bought both diapers and milk products, 8/10 bought cribs (P(A∣B∩C)).
What is the probability that someone bought all three items? P(A∩B∩C) = 8/10 ⋅ 10/12 ⋅ 12/100 = 8/100.
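The arithmetic in this example can be checked with exact fractions. The following is a minimal sketch, assuming the three figures read as 12/100, 10/12, and 8/10; the variable names are illustrative only and do not come from the lecture.

```python
from fractions import Fraction

# Figures from the example above.
p_c = Fraction(12, 100)          # P(C): bought diapers
p_b_given_c = Fraction(10, 12)   # P(B|C): bought milk products, given diapers
p_a_given_bc = Fraction(8, 10)   # P(A|B∩C): bought a crib, given both

# Apply the multiplication rule twice.
p_bc = p_b_given_c * p_c         # P(B∩C) = P(B|C) * P(C)
p_abc = p_a_given_bc * p_bc      # P(A∩B∩C) = P(A|B∩C) * P(B∩C)

print(p_bc)   # 1/10
print(p_abc)  # 2/25, i.e. 8/100
```

Using `Fraction` rather than floats keeps the cancellation visible: the 10s and 12s cancel, leaving 8/100.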
Ex. In a store, 50% of DVDs sold are from Brand 1, 30% are from Brand 2, and 20% are from Brand 3. Among them, 25% of discs from Brand 1 will break within a year, as will 20% from Brand 2 and 10% from Brand 3.
a) What is the probability that a DVD sold from this store needs repair within a year?
Let A1 = {Brand 1}, A2 = {Brand 2}, A3 = {Brand 3}, and B = {needs repair within a year}. Then
P(B) = P(B∩A1) + P(B∩A2) + P(B∩A3)
= P(B∣A1)⋅P(A1) + P(B∣A2)⋅P(A2) + P(B∣A3)⋅P(A3)
= 25/100 ⋅ 50/100 + 20/100 ⋅ 30/100 + 10/100 ⋅ 20/100
= 0.125 + 0.06 + 0.02 = 0.205.
b) Given a DVD that breaks within a year, what is the probability that it was made by each brand?
P(A1∣B) = P(A1∩B) / P(B) = P(B∣A1)⋅P(A1) / P(B) = (0.25 ⋅ 0.5) / 0.205 ≈ 0.61.
In general, P(Aj∣B) = P(B∣Aj)⋅P(Aj) / P(B), so P(A2∣B) = 0.06 / 0.205 ≈ 0.29 and P(A3∣B) = 0.02 / 0.205 ≈ 0.10.
Definition. The events A1, ..., Ak are exhaustive if at least one of the Ai must occur, i.e. S = A1∪A2∪...∪Ak.
Let A1, ..., Ak be exhaustive and exclusive (pairwise disjoint). Then, for any other event B, P(B) = P(B∣A1)P(A1) + P(B∣A2)P(A2) + ... + P(B∣Ak)P(Ak) = ∑_{i=1}^{k} P(B∣Ai)P(Ai). This rule is known as the law of total probability.
We implicitly used this rule to calculate P(B) in the previous example, because the three brands were exhaustive (you couldn’t buy any other brands of DVDs at the store) and exclusive (each DVD was made by only one brand).
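The sum over an exhaustive and exclusive collection can be written as a small helper. This is only a sketch: the function name `total_probability` and its argument names are hypothetical, and the sum-to-1 check is a necessary condition for the events to be exhaustive and exclusive, not a proof of it.

```python
import math

def total_probability(partition_probs, conditional_probs):
    """Return P(B) = sum_i P(B|A_i) * P(A_i).

    partition_probs[i]   is P(A_i)
    conditional_probs[i] is P(B|A_i)
    """
    if len(partition_probs) != len(conditional_probs):
        raise ValueError("need one conditional probability per event A_i")
    # Exhaustive and exclusive events must have probabilities summing to 1.
    if not math.isclose(sum(partition_probs), 1.0):
        raise ValueError("P(A_1) + ... + P(A_k) must equal 1")
    return sum(c * p for p, c in zip(partition_probs, conditional_probs))

brand_share = [0.50, 0.30, 0.20]  # P(A1), P(A2), P(A3) from the DVD example
break_rate = [0.25, 0.20, 0.10]   # P(B|A1), P(B|A2), P(B|A3)

p_b = total_probability(brand_share, break_rate)
print(round(p_b, 3))  # 0.205
```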
Using the same constraints on {A1...Ak}, Bayes’s theorem states:
P(Aj∣B) = P(Aj∩B) / P(B) = P(B∣Aj)P(Aj) / ∑_{i=1}^{k} P(B∣Ai)P(Ai).
Here P(Aj) is called the prior probability of event Aj, and P(Aj∣B) is called the posterior probability of event Aj given that B occurred.
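As a rough illustration of moving from priors to posteriors, the sketch below applies Bayes's theorem to the DVD example. The function name `posterior` and the word "likelihoods" for the values P(B∣Aj) are naming choices made here, not terms from the lecture.

```python
def posterior(priors, likelihoods):
    """Return the list of P(A_j|B) for each j, via Bayes's theorem.

    priors[j]      is P(A_j)
    likelihoods[j] is P(B|A_j)
    """
    joint = [l * p for p, l in zip(priors, likelihoods)]  # P(B|A_j) * P(A_j)
    p_b = sum(joint)  # denominator: P(B) as a sum over all A_i
    if p_b == 0:
        raise ValueError("P(B) is 0, so P(A_j|B) is undefined")
    return [j / p_b for j in joint]

priors = [0.50, 0.30, 0.20]       # brand shares
likelihoods = [0.25, 0.20, 0.10]  # break rates per brand

posteriors = posterior(priors, likelihoods)
for brand, post in enumerate(posteriors, start=1):
    print(f"P(A{brand}|B) = {post:.3f}")
# P(A1|B) = 0.610
# P(A2|B) = 0.293
# P(A3|B) = 0.098
```

The posteriors sum to 1, as they should: given that B occurred, exactly one of the Aj must have occurred. Brand 1's share rises from a prior of 0.50 to a posterior of about 0.61 because its discs break more often than the others.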
Definition. Two events A and B are independent if P(A∣B)=P(A). Otherwise, we say they are dependent.
The intuition for this is that knowing whether B did or didn’t occur doesn’t give you any more information about the chance of A occurring.
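To make the definition concrete, one can enumerate a small sample space and compare P(A∣B) with P(A) directly. The example below is not from the lecture: it assumes two fair six-sided dice with all 36 outcomes equally likely, and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: ordered rolls of two fair six-sided dice.
S = list(product(range(1, 7), repeat=2))

def prob(event):
    """P(event), where event is a predicate on outcomes of S."""
    return Fraction(sum(1 for s in S if event(s)), len(S))

def cond_prob(a, b):
    """P(a|b) = P(a ∩ b) / P(b); requires P(b) > 0."""
    return prob(lambda s: a(s) and b(s)) / prob(b)

def first_even(s):
    return s[0] % 2 == 0

def sum_is_7(s):
    return s[0] + s[1] == 7

def sum_is_8(s):
    return s[0] + s[1] == 8

# P(A|B) equals P(A): independent.
print(prob(first_even), cond_prob(first_even, sum_is_7))  # 1/2 1/2
# P(A|B) differs from P(A): dependent.
print(prob(first_even), cond_prob(first_even, sum_is_8))  # 1/2 3/5
```

Learning that the sum is 7 says nothing about whether the first die is even, while learning that the sum is 8 raises that chance from 1/2 to 3/5.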