With Super Bowl XLVI coming up, there has been much debate over the two starting quarterbacks, Tom Brady and Eli Manning, and whether both can be considered “elite”. Tom Brady, without a doubt, has the career stats to back up the elite label. His touchdowns, passing yards, quarterback rating, and almost all other stats easily eclipse those of Eli Manning. Brady ranks right up there with other quarterback greats such as Aaron Rodgers, Drew Brees, and Brett Favre. But regardless of career stats, when questioned about his place in NFL history, Eli Manning himself said he was “elite”.
This argument gave me the idea of comparing the stats of the two quarterbacks for this weekend’s Super Bowl. But rather than look at overall career totals and averages (because we all know Brady’s overall stats will reign supreme), I decided to compare their QB stats only “in the clutch”. By “in the clutch”, I mean which quarterback delivered the most when it counted, that is, when the game was on the line. For my definition of “clutch”, I will look at who passed for more touchdowns in the all-important closing 4th quarter, and also at who passed for more touchdowns in the 4th quarter with less than 5 minutes remaining in the game. With the pressure on and the game clock dwindling, which quarterback reigns supreme?
To perform this analysis, I needed stats broken down to the play-by-play level. Most NFL stat sites only give stats at the end-of-game level (i.e. the final box score). I found such play-by-play data at www.advancednflstats.com.

As you can see from the screenshot above, the data is stored at the individual play level, tracking the quarter, the time, and a description of each play. The website had data going back to 2002. Tom Brady’s rookie season was in 2000 and Eli Manning’s was in 2004, so there was enough data to cover the majority of both quarterbacks’ careers. In fact, there were well over 384,000 plays across all regular-season NFL games dating back to 2002.
Using the trial version of Datameer, I loaded this data and performed some aggregations. I didn’t need to write any code; I simply used the wizards to guide me through my analysis.
My first step was to import or ingest the data. Data from the site came in multiple .csv files, one for each year dating back to 2002.
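For readers who prefer code to wizards, the import step could be sketched roughly as follows in Python. This is only an illustration, not what Datameer does internally: the function name `load_plays` and the file pattern in the usage note are hypothetical, and the sketch assumes every yearly file carries its own header row.

```python
import csv
import glob


def load_plays(pattern):
    """Read every CSV file matching `pattern` into one list of dict rows."""
    plays = []
    # Sort the paths so the seasons are loaded in a predictable order.
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="", encoding="utf-8") as handle:
            # DictReader uses each file's header row as the dict keys.
            plays.extend(csv.DictReader(handle))
    return plays
```

A call such as `load_plays("pbp_data/*.csv")` (the folder name is made up) would then return one combined list covering all seasons, with every value still stored as text.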


After specifying the data details, Datameer was able to parse through the data and recognize all the column headers and data types.
Once the data had been imported, I opened up a workbook and linked my play-by-play data into a worksheet.
Since this data represented all plays in the NFL for all teams, I used the filter wizard to get only the records for plays by Tom Brady and Eli Manning.
Since the play-by-play data contained descriptions of each play, I applied another filter to find all the touchdown passing play descriptions, while weeding out interceptions, reversed calls, and non-plays. And since we’re looking for stats “in the clutch”, I’m only concerned with the 4th quarter.
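As a rough code equivalent of this filter, here is a minimal Python sketch. The column names `qtr` and `description`, and the marker words `TOUCHDOWN`, `INTERCEPTED`, `REVERSED`, and `NO PLAY`, are assumptions about how the descriptions are worded; the real data may phrase things differently, and the sample rows are invented.

```python
# Words assumed to mark plays that should not count as a passing touchdown.
EXCLUDED_MARKERS = ("INTERCEPTED", "REVERSED", "NO PLAY")


def is_fourth_quarter_td_pass(play):
    """True for a 4th-quarter passing touchdown that actually counted."""
    if str(play["qtr"]) != "4":
        return False
    text = play["description"].upper()
    # Require both a pass and a touchdown in the description.
    if "TOUCHDOWN" not in text or " PASS " not in text:
        return False
    # Drop interception returns, overturned calls, and penalty non-plays.
    return not any(marker in text for marker in EXCLUDED_MARKERS)


# Invented rows for illustration only; not taken from the real data set.
sample = [
    {"qtr": "4", "description": "(2:05) T.Brady pass to A.Receiver for 12 yards, TOUCHDOWN."},
    {"qtr": "4", "description": "(9:41) E.Manning pass intended for A.Receiver INTERCEPTED, TOUCHDOWN."},
    {"qtr": "2", "description": "(0:30) E.Manning pass to A.Receiver for 8 yards, TOUCHDOWN."},
]
kept = [play for play in sample if is_fourth_quarter_td_pass(play)]
print(len(kept))  # 1
```

Only the first sample row survives: the second is an interception return and the third happened in the 2nd quarter.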
Next I created new columns to flag if the record was a play for Brady or Manning. Since all the details are in the play’s description field, I needed to use a “contains” function to check which quarterback the play was for. By simply double clicking into an empty column I was able to launch the wizard to help me configure the “contains” clause on the description field. I created two new columns, one for Brady and one for Manning.
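The two “contains” columns amount to a substring check on the description. A minimal Python sketch, assuming the descriptions abbreviate the quarterbacks as `T.Brady` and `E.Manning` (an assumption about the data's naming style; the key names `is_brady` and `is_manning` are made up for this example):

```python
def flag_quarterbacks(play):
    """Return a copy of the play with one boolean flag per quarterback."""
    text = play["description"]
    flagged = dict(play)  # copy so the original row is left untouched
    flagged["is_brady"] = "T.Brady" in text
    # Matching on the first initial keeps other players named Manning out.
    flagged["is_manning"] = "E.Manning" in text
    return flagged
```

Checking for the initial plus surname, rather than the surname alone, matters here because a bare “Manning” would also match plays by any other player with that last name.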
Now that I had a record for every play and a flag for both Brady and Manning, I could start the analysis. By creating a new sheet and using the GROUPBY function, I grouped my data by football season.

I then performed a group count on my Brady and Manning boolean flags, with one column for Brady and one for Manning.
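The group-and-count step can be pictured as the following Python sketch. It is an analogue of the GROUPBY and group count described above, not Datameer's implementation, and it assumes each row has a `season` column plus the hypothetical boolean flags `is_brady` and `is_manning`.

```python
from collections import defaultdict


def touchdowns_by_season(flagged_plays):
    """Count flagged touchdown plays per season for each quarterback."""
    totals = defaultdict(lambda: {"brady": 0, "manning": 0})
    for play in flagged_plays:
        season = play["season"]
        # A True flag adds 1 to the count, a False flag adds 0.
        totals[season]["brady"] += int(play["is_brady"])
        totals[season]["manning"] += int(play["is_manning"])
    # Return a plain dict ordered by season for easy reading or plotting.
    return dict(sorted(totals.items()))
```

The result maps each season to a pair of counts, which is the shape needed for a per-season line chart with one line per quarterback.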
I saved this workbook and ran it against the entire dataset.
You can now see the results: the number of passing touchdowns for Brady and Manning in the 4th quarter only. A quick plot on Datameer’s dashboard gives the following graphs, with red lines for Brady’s stats and blue lines for Manning’s.
Then, following the same process but now filtering for touchdowns in the 4th quarter with less than 5 minutes remaining in the game, I get the following graphs:
This tells us that Eli Manning is actually better “in the clutch” than Tom Brady. Going back a couple of years, you will see that Manning has, for the most part, matched Brady’s stats. But this is all about “now”, and in 2011 Manning threw 15 touchdowns in the 4th quarter and 10 with less than 5 minutes remaining in the game. Brady’s numbers for the same year are 12 TDs in the 4th quarter and 7 TDs with less than 5 minutes remaining in the game. So when the game is on the line and time is running out… Eli Manning is your elite QB!
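For reference, the stricter “less than 5 minutes remaining” filter used for the second set of numbers could be sketched in Python like this. It assumes hypothetical columns `min` and `sec` that hold the time left in the current quarter; if the data instead records time left in the whole game, the same comparison applies to that value.

```python
FIVE_MINUTES = 5 * 60  # threshold in seconds


def in_final_five_minutes(play):
    """True for a 4th-quarter play with less than 5:00 left on the clock."""
    seconds_left = int(play["min"]) * 60 + int(play["sec"])
    return str(play["qtr"]) == "4" and seconds_left < FIVE_MINUTES
```

Under this definition a play at 4:59 counts, while a play snapped at exactly 5:00 does not.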
While this data set is small, it shows how easily one can analyze data using Datameer. And since Datameer runs on Hadoop, we could easily scale up to billions of records.
So go ahead, download our trial version of Datameer and see what interesting stats you can come up with for the Super Bowl. And don’t hesitate to send us your results; who knows, it might be our next blog post!
http://datameer.com/products/download-trial.html