Preface: This example is a (greatly modified) excerpt from the open-source book Bayesian Methods for Hackers, currently being developed on Github ;)
How to solve* the Showdown on the Price is Right
*I use the term loosely and irresponsibly. Thanks to Allen Downey, of Think Bayes, for pointing out some original errors.
It is incredibly surprising how wild some bids can be on The Price is Right's final game, The Showcase. If you are unfamiliar with how it is played (really?), here's a quick synopsis:
- Two contestants compete in The Showcase.
- Each contestant is shown a unique suite of prizes (we'll assume two prizes per suite for brevity, but this can be extended to any number).
- After the viewing, the contestants are asked to bid on the price for their unique suite of prizes.
- If a bid price is over the actual price, the bid's owner is disqualified from winning.
- If both bids are over, then the closer bid's owner wins.
- Otherwise, the winner is the owner of the bid closest to the true price.
Bayesian Philosophy
Bayesian inference differs from more traditional statistical analysis by preserving uncertainty about our beliefs. At first, this sounds like a bad statistical technique. Isn't statistics all about deriving certainty from randomness? I'll explain.
The Bayesian method holds that probability is better seen as a measure of believability in an event. More formally, Bayesians interpret a probability as a measure of *belief* in an event occurring. A belief of 0 means you have no confidence that the event will occur; conversely, a belief of 1 means you are absolutely certain the event will occur. Beliefs between 0 and 1 allow for weightings of other outcomes.
This philosophy of treating beliefs as probability is natural to humans. We employ it constantly as we interact with the world and only see partial evidence but have to make intelligent decisions. To align ourselves with traditional probability notation, we denote our belief about event $A$ as $P(A) \in [0,1]$.
John Maynard Keynes, a great economist and thinker, said
"When the facts change, I change my mind. What do you do, sir?"
This quote reflects the way a Bayesian updates his or her beliefs after seeing evidence. Even (especially) if the evidence runs counter to what was initially believed, it cannot be ignored. We denote our updated belief as $P(A | X)$, interpreted as the probability of $A$ given the evidence $X$. We call the updated probability the posterior probability, to contrast it with the pre-evidence prior probability.
By introducing prior uncertainty about events, we are already admitting that any guess we make is potentially very wrong. After observing data, evidence, or other information, we update our beliefs, and our guess becomes less wrong. This is the opposite side of the inference coin, where typically we try to be more right.
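To make the prior-to-posterior update concrete, here is a minimal sketch of Bayes' rule for a single yes/no event. The function name `bayes_update` and every number in it are hypothetical, chosen only for illustration; they are not taken from real Showcase data.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(A | X) from P(A), P(X | A) and P(X | not A) via Bayes' rule."""
    numerator = p_evidence_if_true * prior
    evidence = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / evidence

# Hypothetical numbers: a prior belief of 0.2 that the showcase is worth more
# than $30,000. Then a car is revealed (evidence X), which we assume appears
# in 60% of such expensive showcases and in 10% of cheaper ones.
posterior = bayes_update(prior=0.2, p_evidence_if_true=0.6, p_evidence_if_false=0.1)
print(round(posterior, 3))
```

Under these made-up inputs the belief rises from 0.2 to about 0.6: the evidence is six times more likely when $A$ is true, so the prior is pulled strongly toward $A$ without jumping all the way to certainty.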
Modeling the Showcase
There are two reasons why Bayesian inference is the correct methodology for the Showcase problem. The first was explained above: we do not have complete information about the prices in the prize suite, but we do have beliefs about what the prices might be. Also, we have a prior belief about what the final price might be: we can look at historical final prices and derive a suitable prior distribution.
Suppose, then, that the historical final prices have been approximately Normally distributed, with mean 35000 and standard deviation 7500: $$ \text{final_price} \sim N( 35000, 7500 ) $$ This is our prior probability distribution of the final price.
Similarly, as the Showcase is revealed, we have beliefs about what the prices of the items might be (remember we are only considering two prizes initially). We can again model the individual prizes by Normal distributions, with parameters $\mu_i$ and $\sigma^2_i$. This is very realistic and flexible, as we have a likely guess about the price ($\mu_i$) and we can also express our uncertainty in that guess by changing $\sigma^2_i$.
Let's take a step back. This is pretty cool. In other statistical models we can't specify individual beliefs about things like how uncertain we are, or what our prior opinions might be.
Suppose the prize suite contains a snowblower and a trip to Toronto, Ontario (I guess they hope they will use the snowblower on their vacation).
Personally, I would assign the following distributions to the prizes:
\begin{align*}
&\text{snowblower} \sim \text{Normal}(3000, 500 ) \\\\
&\text{trip} \sim \text{Normal}( 12000, 3000 )
\end{align*}
and we know that the relationship holds:
$$ \text{final_price} = \text{snowblower} + \text{trip} + \epsilon $$
Our next step is to find the posterior distribution of the final price given that we have seen the prizes. Normally, this step would involve terrible, cramp-inducing mathematical integrals, but in Probabilistic Programming and Bayesian Methods for Hackers we demonstrate that this approach is not necessary. In fact, the book presents Bayesian inference from a computational/understanding-first, mathematics-second point of view. But I advertise. Let's show how it is done without integrals and hand-cramps.
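As a quick sanity check before any inference, it can help to see what the two prize beliefs alone imply for their sum. The sketch below uses only the standard library and reads the second parameter of each Normal as a standard deviation (an assumption, though it matches how the PyMC code in this post treats those numbers). The seed and the number of draws are arbitrary.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
n_draws = 50_000

# Draw each prize price from its Normal belief (mean, standard deviation)
# and add the two draws to get one plausible total for the prize suite.
sums = [rng.gauss(3000, 500) + rng.gauss(12000, 3000) for _ in range(n_draws)]

print(statistics.mean(sums))   # close to 3000 + 12000 = 15000
print(statistics.stdev(sums))  # close to sqrt(500**2 + 3000**2), about 3041
```

The sum of two independent Normals is again Normal, with the means and the variances added, which is why the simulated spread is dominated by the much less certain trip price.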
We will be using the PyMC library to find the posterior distribution. The code is pretty self-explanatory in light of the above discussion:
import pymc as mc  # PyMC 2.x API

mu_prior = 35000
std_prior = 7500
# PyMC 2 parameterizes the Normal by precision, tau = 1/sigma**2
final_price = mc.Normal("final_price", mu_prior, 1.0 / std_prior**2)

# beliefs about the individual prizes
snowblower = mc.Normal("snowblower", 3000.0, 1.0 / 500.0**2)
toronto = mc.Normal("toronto", 12000.0, 1.0 / 3000.0**2)
price_estimate = snowblower + toronto

@mc.potential
def error(final_price=final_price, price_estimate=price_estimate):
    # log-likelihood tying the final price to the summed prize guesses;
    # the third argument is a precision, so a std of 3e3 is 1/3e3**2
    return mc.normal_like(final_price, price_estimate, 1.0 / 3e3**2)

# start sampling from the posterior
model = mc.Model([final_price, toronto, snowblower, price_estimate, error])
mcmc = mc.MCMC(model)
mcmc.sample(220000, 180000)  # 220000 iterations, the first 180000 discarded as burn-in
price_trace = mcmc.trace("final_price")[:]
The important result is contained in price_trace at the end. It is an array of samples from the posterior distribution. We do not get back an analytical formula for the posterior probability distribution (which often does not exist in closed form); instead, we are returned samples, which can be plotted as a histogram to observe the shape of the distribution.
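Besides plotting a histogram, an array of samples can be summarized directly. The following is a minimal sketch using only the standard library; `stand_in_trace` is a hypothetical substitute for `price_trace`, drawn from an arbitrary Normal so the example runs without PyMC. Its numbers are illustrative and are not the MCMC output.

```python
import random
import statistics

rng = random.Random(1)
# Hypothetical stand-in for price_trace: NOT the real MCMC output.
stand_in_trace = [rng.gauss(28000, 4000) for _ in range(20_000)]

posterior_mean = statistics.mean(stand_in_trace)

# quantiles(n=40) returns 39 cut points at 2.5%, 5%, ..., 97.5%,
# so the first and last bracket the central 95% of the samples.
cuts = statistics.quantiles(stand_in_trace, n=40)
lower, upper = cuts[0], cuts[-1]

print(posterior_mean, lower, upper)
```

With real MCMC output the same two lines give a point estimate and a 95% credible interval, which is often all that is needed from the posterior.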
Notice how our final price estimate, which used to be centered at 35,000, is now closer to 28,000. This reflects what our data (i.e. the observed prizes) are suggesting: the final price is lower. But we have not completely discarded our prior, as it is still information we are using: our prior reflects that prizes, for whatever reason, have historically often cost more than our own current beliefs suggest. Hence we have a great balance between objectivity and subjectivity.
We still need to make a bid. We need a way to choose a *good* bid using this posterior distribution. A naive Bayesian would choose the mean of the posterior, but we can do soooo much better. The following is the second reason why Bayesian inference is the right way to approach this problem.
Making a better bid: Not going over
Instead of using the mean of the posterior as our bid, we should choose a bid more intelligently. We understand that the Showcase has a unique payoff: if a bid is over the true price, the keys to winning are essentially handed over to the other contestant; if the bid is not close to the true price (without going over), too much room might be left for the other contestant to guess closer. We can (approximately) quantify this.
A loss function, $L$, is a function that accepts the truth and an estimate of the truth, and returns a number that reflects the outcome of that estimate. The larger the loss returned, the worse the outcome. For example, $$L( \theta, \hat{\theta} ) = (\theta - \hat{\theta})^2$$ is a loss function that grows as the estimate moves away from the truth and is minimized only when $\hat{\theta} = \theta$. In Bayesian inference, we are free to choose our own loss function to reflect the outcomes of our estimates:
import numpy as np

def showcase_loss(bid, true_price, pain=80000):
    # flat penalty for overbidding; otherwise the distance below the true price
    if true_price < bid:
        return pain
    else:
        return np.abs(true_price - bid)
The interpretation of this loss function is that if we bid over the true price, we are penalized heavily through the parameter pain. A lower pain means that you are more comfortable with the idea of going over (remember that going over does not guarantee you lose: your competitor might go over as well). If we do bid under the true price, we want to be as close as possible, hence the loss in the else branch is an increasing function of the distance between the bid and the true price.
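The asymmetry of this loss is easiest to see by evaluating it at a few bids. The sketch below restates the loss in plain Python (no NumPy) so that it is self-contained; the true price of 20000 and the bids are hypothetical values picked for illustration.

```python
def showcase_loss(bid, true_price, pain=80000):
    """Flat penalty `pain` for overbidding, otherwise the shortfall."""
    if true_price < bid:
        return pain
    return abs(true_price - bid)

true_price = 20000  # hypothetical true price, for illustration only
for bid in (15000, 19000, 20000, 20001):
    print(bid, showcase_loss(bid, true_price))
```

Bidding $1,000 under costs 1000, bidding exactly right costs 0, and bidding a single dollar over costs the full pain of 80000. That cliff at the true price is what pushes good bids downward.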
But we don't know what the true price is...
Right. We don't know what the true price is; call it $p$. In fact, we have a whole distribution of what the true price might be, namely the posterior distribution. Hence, we have to look at the expected loss as a function of the bid, $\hat{p}$: $$\ell(\hat{p}) = E_{p}[ L( p, \hat{p} ) ]$$ We can approximate the expected loss by averaging over the $N$ samples $p_1, \ldots, p_N$ from the posterior: $$\frac{1}{N} \sum_{i=1}^N L( p_i, \hat{p} ) \approx E_{p}[ L( p, \hat{p} ) ]$$ We can vary our bid and see what the expected loss is, varying the pain parameter too:
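The sample average above translates almost line for line into code. This is a minimal sketch under stated assumptions: `expected_loss` is a hypothetical helper name, and `stand_in_trace` is a made-up substitute for the posterior samples (an arbitrary Normal), so the printed values illustrate only the shape of the curve, not the post's actual results.

```python
import random

def showcase_loss(bid, true_price, pain=80000):
    """Flat penalty `pain` for overbidding, otherwise the shortfall."""
    if true_price < bid:
        return pain
    return abs(true_price - bid)

def expected_loss(bid, price_samples, pain=80000):
    """Monte Carlo estimate: average the loss over sampled true prices."""
    total = sum(showcase_loss(bid, p, pain) for p in price_samples)
    return total / len(price_samples)

rng = random.Random(2)
# Hypothetical stand-in for the posterior samples, NOT real MCMC output.
stand_in_trace = [rng.gauss(28000, 4000) for _ in range(5_000)]

# Evaluate the expected loss on a coarse grid of candidate bids.
for bid in range(10000, 40001, 5000):
    print(bid, round(expected_loss(bid, stand_in_trace)))
```

Low bids are safe but leave a large shortfall, while high bids are dominated by the pain term, so the curve falls and then climbs steeply once overbidding becomes likely.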
Of course, for a given pain parameter, we want to minimize our expected loss. We use SciPy's optimization routines to find the minimum of each curve. These points provide the best bid (with respect to our unique loss) for the Showcase.
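For readers without SciPy at hand, the minimization can be approximated by a plain grid search, which is a swapped-in technique and not the optimizer used for this post. In the sketch below, `best_bid`, the grid spacing of 250, and the stand-in samples are all assumptions made for illustration.

```python
import random

def expected_loss(bid, price_samples, pain):
    """Average showcase loss of `bid` over sampled true prices."""
    total = 0.0
    for p in price_samples:
        # overbidding costs `pain`; otherwise the cost is the shortfall
        total += pain if p < bid else p - bid
    return total / len(price_samples)

def best_bid(price_samples, pain, candidates):
    """Return the candidate bid with the smallest estimated expected loss."""
    return min(candidates, key=lambda b: expected_loss(b, price_samples, pain))

rng = random.Random(3)
# Hypothetical stand-in for the posterior samples, NOT real MCMC output.
stand_in_trace = [rng.gauss(28000, 4000) for _ in range(2_000)]
candidates = range(5000, 40001, 250)  # grid of bids, resolution is arbitrary

for pain in (5000, 20000, 80000):
    print(pain, best_bid(stand_in_trace, pain, candidates))
```

A grid search is slower and coarser than a proper optimizer, but it cannot get stuck in a local minimum, and it makes the qualitative pattern visible: the larger the pain, the lower the best bid.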
Note how far our optimized bids are from the posterior mean of about 28 thousand. This is because, for any bid we make, there is still a chance that we will exceed the true price, something we decided we really do not want to happen (we decided this through our loss function above).
Conclusion
This example really shows off the power of Bayesian methods. We started with a very uncertain estimation problem, but with a flexible framework that allows us to be uncertain, we were able to update our uncertainty (the posterior) and be less wrong. We then optimized our bid by investigating the expected loss to find the best bid. We can probably derive a more rational loss, or at least be more intelligent in choosing a good pain parameter.
I encourage you to check out Probabilistic Programming and Bayesian Methods for Hackers in Python for more examples and tools to become less wrong.
The hyper-intelligent Allen Downey, of Think Bayes, has also created a blog-post about this subject, implementing it in his Python framework.
References
- Davidson-Pilon, C. et al., Probabilistic Programming and Bayesian Methods for Hackers in Python
- Fonnesbeck, C. et al., PyMC