As you may know, probability questions come up frequently in technical interviews for Data Scientist, Data Analyst, and even ML Engineer roles. They usually focus on conditional probability (applications of Bayes’ Rule), counting, and the density or mass functions of common probability distributions (normal distribution, binomial distribution, Poisson distribution, etc.). In the interview, you are not expected to just solve the puzzle silently. Interviewers want to understand your thought process, and they are more likely to probe your knowledge through conversation.
Therefore, the keys to cracking those interviews are 1) having solid probability fundamentals and 2) clearly articulating your approach and why you chose it.
Feeling like you have to memorize all those math formulas from your school days? Not sure how to organize and communicate your thoughts to the interviewer? Do not worry! In this article, I have selected 3 typical probability questions (popular ones, upvoted by people who say they have seen them in FAANG interviews) and provide a step-by-step ‘thinking and talking points’ template, so you can learn some common approaches to solving these questions and how to communicate them to the interviewer.
As a warm-up, let’s begin with a simple question.
Question 1: From a deck of 100 cards numbered 1 to 100, you pick 2 cards. What is the probability that one of the numbers is twice the other?
General thought process:
- Since a probability is a ratio, always be clear about what the numerator and the denominator of the calculation are.
- Enumerate some examples. Sometimes there are too many cases to list in full. That is OK, because the goal is to start with a few examples and look for a recognizable pattern.
- Generalize into an expression. (Depending on the problem, counting the enumerated cases is sometimes enough, in which case this step can be skipped.)
Applying this guidance to the question, here are the steps you could follow to reach the answer.
Step 1: Consider what the numerator and the denominator are.
In this question:
numerator: the number of ways to pick 2 cards from the 100 such that one number is twice the other.
denominator: the total number of ways to pick 2 cards from the 100 cards.
Step 2: Enumerate the cases.
Let (a,b) be a pair of cards where
- “a” stands for the number on the first card
- “b” stands for the number on the second card
Some valid examples are shown below.
(1,2)
(2,4)
(3,6)
(4,8)
(5,10)
…
(50,100)
Step 3: Generalize into an expression.
As you can see, we do not need to list every example. Just by observing the pattern, it is straightforward to conclude that there are 50 pairs in which the second number is twice the first. Also, always remember to check whether the order of the two events matters. In our case, both (1,2) and (2,1) are valid, and the same applies to (2,4), (3,6), etc. Therefore, there are 50*2 = 100 ways in total for one number to be twice the other. The numerator is 100.
For the denominator: there are 100 ways to pick the first card, and then 99 ways to pick the second from the remaining cards. So in total there are 100*99 = 9900 ways to pick 2 cards from 100 cards. (If you ignore order instead, you get 50 pairs out of 100*99/2 = 4950, which is the same ratio. What matters is that the numerator and the denominator are counted the same way.)
Answer: 100/(100*99) = 1/99
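If you want to double-check the counting for Question 1, a brute-force enumeration is a quick sanity check. The sketch below is only an illustration, not part of the interview answer; the function name `prob_one_double` is made up for this example, and it assumes ordered draws without replacement, exactly as in the steps above.

```python
from fractions import Fraction
from itertools import permutations


def prob_one_double(n_cards: int = 100) -> Fraction:
    """Probability that one of two drawn cards is twice the other."""
    # Ordered draws without replacement: (first card, second card).
    draws = list(permutations(range(1, n_cards + 1), 2))
    # Favorable draws: either card is exactly twice the other.
    favorable = sum(1 for a, b in draws if a == 2 * b or b == 2 * a)
    return Fraction(favorable, len(draws))


print(prob_one_double())  # 1/99
```

Using `Fraction` keeps the result exact, so it can be compared directly with the hand-derived 100/9900.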
Counting is not that difficult, right?
OK. Now let’s try a different problem, this time about rolling a die.
Question 2: If you roll a die 3 times, what is the probability of getting two 6s in a row?
Step 1: Consider what the numerator and the denominator are.
numerator: the number of ways to get two 6s in a row when you roll a die 3 times.
denominator: the total number of ways to roll a die 3 times.
Step 2: Enumerate the cases.
For the numerator: there are 3 scenarios in which you get two 6s in a row.
- 6 on all three rolls
- 6s on the first two rolls only
- 6s on the last two rolls only
Step 3: Generalize into an expression.
As in Question 1, I will denote the result of the rolls as (a,b,c) for simplicity, where
- “a” stands for the result of the first roll
- “b” stands for the result of the second roll
- “c” stands for the result of the third roll
Scenario 1: 6 on all three rolls. There is only one way of doing so.
so (6,6,6) -> 1 way
Scenario 2: 6, 6, and then something else (this number can be 1, 2, 3, 4, or 5, but not 6).
so (6,6,?) -> 5 ways
Scenario 3: something else (this number can be 1, 2, 3, 4, or 5, but not 6), then 6, 6.
so (?,6,6) -> 5 ways
Therefore, in total there are 1+5+5 = 11 ways to get two 6s in a row when rolling a die three times.
Denominator:
Each roll of the die has 6 possible results, so in total there are 6*6*6 = 216 ways.
Answer: the probability of two 6s in a row is 11/216
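The three scenarios for Question 2 can also be verified by listing all 216 outcomes in code. This is a minimal sketch under the same interpretation used above, where (6,6,6) counts as a valid case; the function name `prob_two_sixes_in_a_row` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def prob_two_sixes_in_a_row(n_rolls: int = 3) -> Fraction:
    """Probability that some pair of adjacent rolls are both 6."""
    # Every possible sequence of rolls, e.g. (1, 1, 1) ... (6, 6, 6).
    outcomes = list(product(range(1, 7), repeat=n_rolls))
    # A sequence is favorable if any two neighboring rolls are both 6.
    favorable = sum(
        1
        for rolls in outcomes
        if any(x == 6 and y == 6 for x, y in zip(rolls, rolls[1:]))
    )
    return Fraction(favorable, len(outcomes))


print(prob_two_sixes_in_a_row())  # 11/216
```

In an interview you would still reason through the scenarios by hand, but a check like this is a handy way to confirm that no case was double counted.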
For Questions 1 and 2, we could easily find the answer by enumerating and counting the cases.
Finally, here is a more difficult question, where enumerating all the cases does not work.
Question 3: If you roll a die 3 times, what is the probability that the numbers come in strictly increasing order? Ex: (1,2,4) is a valid case, but (3,6,5) is not.
Step 1: Consider what the numerator and the denominator are.
numerator: the number of ways the numbers come in increasing order when you roll a die 3 times.
denominator: the total number of ways to roll a die 3 times.
Step 2: Enumerate some cases and search for patterns.
First, let’s enumerate some valid cases ourselves. It is best to start with 1. Below are some examples:
(1,2,3)
(1,2,4)
(1,2,5)
(1,2,6)
(1,3,4)
(1,3,5)
(1,3,6)
…
Clearly, it is not practical to list all the cases by hand. We need to find another method.
Now we can ask ourselves: “What are the criteria for a valid sequence?” The three numbers must be distinct, and there are 6*5*4 = 120 ways to roll three distinct numbers. Among those 120 ways, how many are in increasing order?
Let’s take one more step back. Given 3 distinct numbers, how many possible orderings are there? Using the numbers (1,2,3) as an example, here are all the permutations.
(1,2,3) -> increasing order
(1,3,2) -> no order
(2,1,3) -> no order
(2,3,1) -> no order
(3,1,2) -> no order
(3,2,1) -> decreasing order
Among the 6 orderings, only the first is in increasing order, so the ratio of valid cases is 1/6. That is to say, for any 3 distinct numbers arranged in random order, the probability of a strictly increasing order is 1/6 (and the probability of a strictly decreasing order is 1/6 too).
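The (1,2,3) example generalizes to any three distinct die faces, and that claim is easy to check in code. The sketch below is illustrative only; the helper name `count_increasing_orderings` is an assumption made for this example.

```python
from itertools import combinations, permutations


def count_increasing_orderings(values) -> int:
    """Count the orderings of `values` that are strictly increasing."""
    return sum(
        1
        for ordering in permutations(values)
        if all(x < y for x, y in zip(ordering, ordering[1:]))
    )


# Every set of 3 distinct die faces has 3! = 6 orderings,
# and exactly one of them is strictly increasing.
for trio in combinations(range(1, 7), 3):
    assert len(list(permutations(trio))) == 6
    assert count_increasing_orderings(trio) == 1
```

The loop finishes without raising an assertion error, which supports the 1/6 ratio for every choice of three distinct numbers, not just (1,2,3).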
Step 3: Generalize the pattern we observed into an expression.
We multiply the 120 ways by the ratio of 1/6. The result is 120*(1/6) = 20 ways, hence 20 is the numerator.
As in Question 2, the denominator is 6*6*6 = 216 ways.
Answer: 20/216 = 5/54
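As a final sanity check for Question 3, a computer can count the increasing sequences directly, even though doing so by hand is tedious. This is a sketch rather than a recommended interview answer; `prob_strictly_increasing` and its parameters are names invented for this example.

```python
from fractions import Fraction
from itertools import product


def prob_strictly_increasing(n_rolls: int = 3, sides: int = 6) -> Fraction:
    """Probability that the rolls come out in strictly increasing order."""
    outcomes = list(product(range(1, sides + 1), repeat=n_rolls))
    # Favorable: each roll is strictly larger than the one before it.
    favorable = sum(
        1
        for rolls in outcomes
        if all(x < y for x, y in zip(rolls, rolls[1:]))
    )
    return Fraction(favorable, len(outcomes))


print(prob_strictly_increasing())  # 5/54
```

The enumeration finds the same 20 favorable sequences out of 216 as the 120*(1/6) argument.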
Conclusion
I hope you have gained more confidence after practicing these 3 questions with me. You may encounter variations of these questions in real interviews. Just remember that interviewers care more about your thought process than about memorized formulas. Most of the time, fundamental math knowledge is sufficient, and it is your communication and problem-solving ability that make you stand out from the other candidates.