Full disclosure: Drew Conway is an official advisor to Yhat, and is compensated for his time.
Often, the best things in life occur when you can mix two things you love. Today, I want to talk about how I have recently been doing exactly this by mixing a longtime love, fantasy football, with a more recent discovery: Yhat.
I had the good fortune of meeting Austin and Greg, the Yhat founders, several months ago for a breakfast meeting. Over coffee and blueberry pancakes we discussed what they were building. In classic entrepreneurial spirit, they were trying to address a pain point they themselves had faced several times: deploying predictive models into production systems.
The problem is simple: those writing predictive models (data scientists) like to use the tools they know well and that work well for that task, usually R, Python, or even Excel. Likewise, those creating production systems (devops) build those systems using tools they know well and that work well for that task, usually Java, C#, Ruby, etc. Unfortunately, these tools do not work well together. There are many ways to solve this problem, and if you ask 100 different data scientists at 100 different companies they will each give you a different answer. Many (maybe all) will involve some hacked-together work-flow that at some point required the data scientist to waste a huge amount of time building something outside their core competency.
This sucks, Yhat gets that, and I love them for that.
Yhat provides the glue that allows data scientists to push their models, or model results, directly into production systems using the tools they already know. When Austin and Greg told me this I could envision all 100 of those frustrated data scientists suddenly freed from the bonds of their duct-taped together solutions and hours wasted learning web stacks.
Since then, I have been working with the Yhat team to help them reach their vision. As the calendar turned to August I also started working on another project: analyzing NFL players for my many upcoming fantasy football drafts. It is no secret that I have used a quantitative approach to fantasy football in the past. One of the problems facing a fantasy football manager during a draft is assessing when in the draft a player should be taken. If I draft a player too early, I have over-valued that player and may suffer an opportunity cost. If I wait too long, another manager might draft him instead, and I have thus under-valued that player. Therefore, my strategy this year has been to create the best predictive model of when a player should be drafted.
As I began working through this process it occurred to me that it would be even better if I could test a bunch of models and see how their predictions compared to one another. Moreover, the value of players fluctuates greatly from day to day due to injuries, contract negotiations, or performance in training camp and pre-season, so any test needs to use current data. To do this, however, I would have to create a work-flow that allowed me to push my predictions to a system that could then test them against the most recent mock-draft data.
But I want to spend my time building models, not building a test system. Enter Yhat, and the Yhat Fantasy Football Challenge.
Yhat Fantasy Football Challenge
The team created the Yhat Fantasy Football Challenge. The idea is pretty simple: write a model that can best predict the outcome of a 10 team x 15 player draft by using mock-draft data provided by FantasyFootballCalculator.com. Use the Yhat API to push the results of your model to Yhat, and they'll test it for you and provide real-time feedback on the results.
Yhat provides the data, templates for pushing your results to Yhat using both R and Python, and a production system to test your predictions. All you have to do is sign up for a free Yhat API key and start creating models.
Example model push
One of the great things about having a rapid deployment process for predictive models is it allows you to test, test, re-test, and then test some more. In this case, to push a model to Yhat all I need to do is:
- Get the data
- Write a model for predicting the outcome of a 10x15 draft in R or Python
- Create an ordered list of playerid values and store it as the
- Deploy model to Yhat using the
- Rinse, and repeat
Because this process is so easy, I can try even the most ridiculous models. An example of a ridiculous modeling strategy, and one that I would never actually use but am happy to post here, is what I call the Igon-drafter Model, named for Malcolm Gladwell's silly misspelling because it is an equally silly strategy.
In this model I represent the training data, in this case the previous 30 days of mock draft data, as a player-by-draft-position matrix, wherein the rows are players, the columns are draft positions, and each cell contains the number of times that player was drafted at that position. Today, this data looks like this:
Through the good fortune of alphabetical sorting, the first 10 rows contain two well known players: Adrian Peterson and Aaron Rodgers. Here we can see that the former was drafted 7,350 times as the first overall pick, while the latter only 19 times. And somehow, in three drafts, A.P. dropped to the fifth pick!
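For readers who want to see the shape of that matrix, here is a minimal Python sketch of how such a count table could be built. It is not the code behind the post: the `(playerid, pick)` record format, the function name `build_count_matrix`, and the sample ids are all hypothetical, and the real mock-draft data may be laid out differently.

```python
from collections import defaultdict

def build_count_matrix(picks, n_positions):
    """Return (player_ids, matrix), where matrix[i][j] counts how often
    player_ids[i] was taken with overall pick j + 1."""
    counts = defaultdict(lambda: [0] * n_positions)
    for player_id, pick in picks:
        if not 1 <= pick <= n_positions:
            raise ValueError(f"pick {pick} is outside 1..{n_positions}")
        counts[player_id][pick - 1] += 1
    # Sort ids so the row order is stable from run to run.
    player_ids = sorted(counts)
    return player_ids, [counts[p] for p in player_ids]

# Hypothetical mock-draft records: (playerid, overall pick number).
picks = [
    ("peterson", 1), ("peterson", 1), ("peterson", 2),
    ("rodgers", 1), ("rodgers", 3), ("rodgers", 3),
]
player_ids, matrix = build_count_matrix(picks, n_positions=3)
for player_id, row in zip(player_ids, matrix):
    print(player_id, row)
```

For the full challenge, `n_positions` would presumably be 150 (10 teams x 15 picks), with one row per player seen in the mock drafts.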
Computing the Igon-model in R is a straightforward combination of matrix operations:
Notice that in this case I am using the second eigenvector as the score value. The results from the leading eigenvector were so out of whack that they are not even worth posting, but interestingly the second vector produces something that, if you follow football, makes some sense:
The first two rounds, or first twenty picks, look good and fit my intuition. Then, something goes terribly wrong at the beginning of the third round. Needless to say, picking Mark Sanchez in the middle of the third round would be an offense punishable by exile in most fantasy leagues!
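The eigenvector scoring described above can be sketched in plain Python. This is an illustration under stated assumptions, not the R code from the post: it assumes the square matrix being decomposed is the player-by-player product M Mᵀ of the count matrix M, and it finds the second eigenvector with power iteration plus deflation rather than a library eigen-solver. The function names are hypothetical.

```python
import math

def mat_vec(a, v):
    """Multiply matrix a (list of rows) by vector v."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]

def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dominant_eigenpair(a, iterations=2000):
    """Power iteration for a symmetric positive semi-definite matrix."""
    n = len(a)
    v = normalize([1.0 + i / n for i in range(n)])  # fixed, non-uniform start
    for _ in range(iterations):
        v = normalize(mat_vec(a, v))
    # Rayleigh quotient gives the eigenvalue for the unit vector v.
    eigenvalue = sum(x * y for x, y in zip(v, mat_vec(a, v)))
    return eigenvalue, v

def second_eigenvector_scores(m):
    """Return (eigenvalue, vector) for the second eigenvector of M * M^T."""
    # Assumption: the square matrix is the player-by-player product M * M^T.
    a = [[sum(x * y for x, y in zip(r1, r2)) for r2 in m] for r1 in m]
    lam1, v1 = dominant_eigenpair(a)
    # Deflation: subtract the leading component, then iterate again.
    n = len(a)
    deflated = [[a[i][j] - lam1 * v1[i] * v1[j] for j in range(n)]
                for i in range(n)]
    return dominant_eigenpair(deflated)

# Hypothetical 3-player x 3-position count matrix.
counts = [
    [5, 1, 0],
    [1, 4, 1],
    [0, 1, 5],
]
eigenvalue, scores = second_eigenvector_scores(counts)
print(round(eigenvalue, 6), [round(s, 4) for s in scores])
```

Two caveats: the sign of an eigenvector is arbitrary, so the direction of the ranking has to be chosen by inspection, and the sketch would fail on a rank-one matrix, where the deflated matrix is all zeros.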
But, with Yhat it is totally costless for me to deploy this and test it, so why not run it up the flagpole? In this case I just need to produce an ordered list of 150 playerid values based on the ranking produced by the Igon-model. In my work-flow, I have created two files:
The league format script creates a data structure that keeps track of the draft as it happens. This is generally needed because there are restrictions on the number of players at each position that a team can draft, based on the format of a fantasy football league. In this case, we are using a fairly standard league format.
The drafting function then takes the ranks of players in training.probs and produces a draft using a “best available” strategy for all teams that are trying to fill out their rosters based on the league format.
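To make the "best available" idea concrete, here is a minimal Python sketch of a drafting function with per-position roster limits. It is a guess at the general shape, not the script described above: the function name, the `(playerid, position)` input format, the snake pick order, and the example limits are all assumptions.

```python
def best_available_draft(ranked_players, position_limits, n_teams=10, n_rounds=15):
    """Simulate a draft in which every team takes its best eligible player.

    ranked_players: list of (playerid, position) tuples, best first.
    position_limits: maximum number of players per position on one roster.
    Returns the ordered list of drafted playerids.
    """
    rosters = [{pos: 0 for pos in position_limits} for _ in range(n_teams)]
    taken = set()
    draft_sequence = []
    for round_index in range(n_rounds):
        # Assumed snake order: the pick direction reverses every other round.
        if round_index % 2 == 0:
            order = range(n_teams)
        else:
            order = range(n_teams - 1, -1, -1)
        for team in order:
            for player_id, position in ranked_players:
                if player_id in taken:
                    continue
                # Skip positions this roster has filled or does not use.
                if rosters[team].get(position, 0) >= position_limits.get(position, 0):
                    continue
                taken.add(player_id)
                rosters[team][position] += 1
                draft_sequence.append(player_id)
                break
            else:
                raise ValueError(f"no eligible player left for team {team}")
    return draft_sequence

# Hypothetical toy league: 2 teams, 2 rounds, one QB and one RB per roster.
ranked = [("rb1", "RB"), ("rb2", "RB"), ("rb3", "RB"), ("qb1", "QB"), ("qb2", "QB")]
print(best_available_draft(ranked, {"QB": 1, "RB": 1}, n_teams=2, n_rounds=2))
```

With the defaults of 10 teams and 15 rounds, the returned list has 150 entries, which matches the length of the list the challenge asks for. The limits in `position_limits` need to sum to at least `n_rounds`, or teams will run out of eligible picks.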
Running the drafting script produces our ordered list draft.sequence, which I can now use to deploy this model to production and test it using the template provided by Yhat.
Template for deploying model results with R.
Now I can just head over to http://yhat-fantasy-football.herokuapp.com/ to see how well the Igon-model is doing.
Spoiler alert: it does terribly!
After setting this up, we realized a leaderboard is no good without a contest. So the Yhat Fantasy Football Challenge has the following stakes:
Those individuals, not from the Yhat team, who push the 9 best performing models will be invited to join the Yhat Fantasy Football League. The tenth spot in the league will be a co-managed Yhat team.
Good luck! Time for me to get back to some more serious modeling strategies…