Using market valuations in an NFL Pick’em Pool

This Fall, I’m playing in an NFL Pick’em contest with some friends from college. Each week, I predict the outcome of every NFL game according to Yahoo’s point-spread. For example, during Week 1 when the Patriots hosted the Texans, Yahoo assigned the Texans a seven-point handicap. My prediction came down to whether the Patriots would beat the Texans by more than seven points, or whether I thought the Texans would lose by no more than seven points (or win outright). This post is about how I will be using data to make each of the 256 predictions over the course of the season.

I remembered reading a blog post by then-statistics professor Michael Lopez about evaluating the outcomes of sporting events against the gambling market.

I’d also read about Michael Beuoy’s Inpredictable, which publishes team rankings based upon the implications of the sports wagering market. The thesis is that the wagering market is an efficient enough economy to derive the relative strengths and weaknesses of each team.

The methodology for the implied rankings is on the Inpredictable, but the basis for the rankings is on Generic Points Favored (GPF). That is, the number of points a team would be favored by against an average opponent on a neutral field.

With a fairly minimal amount of code, I have been using this data to enter my predictions each week. This is a brief overview of how to do so:

Scrape the rankings table from Inpredictable

This creates a dataframe that looks like this:

Team	GPF	oGPF	dGPF	W-L	Projected Seed	pSOS	fSOS
LA	7.9	5.6	2.3	3-0	0.996	-3.2	-0.4
KC	5.0	8.1	-3.2	3-0	0.936	3.1	-0.1
LAC	4.4	2.8	1.5	1-2	0.699	2.9	-1.3
PIT	4.3	3.2	1.1	1-1-1	0.538	1.9	1.3
BAL	4.1	2.3	1.8	2-1	0.667	-2.0	1.3
NE	3.7	3.3	0.4	1-2	0.717	1.4	-1.0

Now that I have the up-to-date market-based team ratings, all that’s left is to compare the ratings of teams that are playing each other in the coming week. To do so, I loaded a schedule table, which looks like this:

Week	Date	Time	Away	Home
1	September 6	8:20 PM	ATL	PHI
1	September 9	1:00 PM	PIT	CLE
1	September 9	1:00 PM	CIN	IND
1	September 9	1:00 PM	TEN	MIA
1	September 9	1:00 PM	SF	MIN
1	September 9	1:00 PM	TB	NO

I combined the ratings with the schedule data using the dplyr package, and calculated each matchup’s offensive, defensive, and overall ratings delta:

This applies the market ratings to all matchups throughout the season. At this point, the table looks like this:

Date	Away	Home	Favorite	Margin
September 20	NYJ	CLE	NYJ	0.6
September 23	NO	ATL	ATL	0.1
September 23	CIN	CAR	CAR	0.1
September 23	NYG	HOU	HOU	1.8
September 23	TEN	JAC	JAC	6.4
September 23	SF	KC	KC	7.0

Since I am only interested in the coming week, I created a visualization with ggplot2 to quickly illustrate the predictions I will make. I wrap this into a function called getMarketRatings, which takes the NFL Schedule week as an input, and provides a match-up ratings plot, and table as output:

The table is displayed within RStudio, while the plot is saved to my computer, and looks like this:

Alt text