Sunday, August 9, 2015

Equibase to partner with STATS: What will this mean, if anything, to horseplayers?

One of the bigger pieces of news from this weekend's 63rd annual Round Table Conference on Matters Pertaining to Racing in Saratoga Springs, N.Y., was the announcement that Equibase, "The Supreme Gatekeeper of All Racing Data", and STATS LLC, a data and content company with ties to the NFL, MLB, NBA, etc., have entered into a "strategic partnership to develop products and services for horseplayers." (H/T The Paulick Report). While this partnership sounds exciting, especially if it might lead to a freeing of the ridiculously guarded racing data, color me skeptical as to whether this will actually improve the lives of horseplayers.

If I were to ever compose a manifesto of what I would do if I were Supreme Lord of Horse Racing, right up at the very beginning of my absurdly long document would be a topic labeled simply "Unshackle The Results."

Horse racing, a sport that depends greatly on gambling dollars from everyday players, keeps the lifeblood of handicapping - raw data - locked away from handicappers behind over-priced "plans" and "secured" PDF files. And by "data", I'm not talking about speed figures, pace figures, or any other proprietary creations. I'm talking about the day-to-day results of races - the stuff you used to see in your newspaper every day and that now sits in PDF files on DRF, Equibase or Bris. The very thing that horseplayers need in order to wager (and, in my opinion, wager more frequently) is priced at such a high level that only the very, very serious can dabble in large data analysis.

Access to data isn't a problem if you work for the Daily Racing Form, or Bris, or Equibase, but the overwhelming majority of horseplayers aren't that lucky. Instead, we're presented with options for comma-delimited files at monthly rates that do nothing but cut into our wagering dollars.

I know, I know: "but Matt, these data companies need to make money off all the work that goes into procuring this data!" I suppose that might be true, although it makes you wonder why larger sports are able to give it away for free. Case in point: Major League Baseball.

Here's a link to the main Baseball Reference page for the 2015 Seattle Mariners (sadly, my hometown team). Scroll down to the player totals for all the "hitters" on this year's team (and I use the term "hitters" lightly), and you'll see some options at the top of the table, one of which is "Export." I click on that button and, in less than 10 seconds, the hitting stats for every single player on this year's team are transported into a nice, easy-to-analyze Excel file. And I didn't have to pay a frickin' dime!

The fun doesn't end with yearly totals, either: I can download box scores, schedules and results, and much, much more, all with the press of a button and all for zero dollars. I use this data for fun, to follow a sport that I love and do some extra analysis from time to time.

Want more examples? I got more.

Go to Pro Football Reference. Or Pro Basketball. Or College Football. Or College Basketball. You can download similar data at all those sites.

Go to Click on "Data Downloads." Those are the game logs (box scores) organized by year for every baseball season since 1871. Yeah, it's free. [This is the source that Baseball Reference uses for much of its data.]

Go to In less than a minute I can download every individual season for every individual player in baseball since 1871 into Excel. Batting. Pitching. Fielding. You name the data, it's in the database. And then I can analyze it in any way that I want.
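
To give a flavor of what "analyze it in any way that I want" looks like once the data sits in a plain comma-delimited file, here's a minimal Python sketch. Fair warning: the column names (`player`, `at_bats`, `hits`) and the three sample rows are placeholders I made up for illustration. They aren't real stats, and they aren't the actual layout of any download mentioned here.

```python
import csv
import io

# Made-up placeholder rows standing in for an exported batting file.
# Neither the column names nor the numbers come from a real download.
SAMPLE_CSV = """player,at_bats,hits
Player A,500,150
Player B,400,100
Player C,0,0
"""


def batting_averages(csv_text):
    """Return {player: hits / at_bats}, skipping players with no at-bats."""
    averages = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        at_bats = int(row["at_bats"])
        if at_bats == 0:
            continue  # avoid dividing by zero for players who never batted
        averages[row["player"]] = int(row["hits"]) / at_bats
    return averages


# Print players from highest to lowest average.
ranked = sorted(batting_averages(SAMPLE_CSV).items(),
                key=lambda item: item[1], reverse=True)
for player, avg in ranked:
    print(f"{player}: {avg:.3f}")
```

That's the whole point: when results come as rows and columns instead of a "secured" PDF, a couple dozen lines gets you from download to answer.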

No other major sport requires its fans to pay for results data. None. Not MLB. Not the NFL. Not the NBA. None.

If I want to download the yearly player stats for every single New York Yankees season I can do so easily and without cost. If I want the lifetime result charts for a single horse I get to pony up $10 to DRF. TEN DOLLARS! For the historical record of a single horse. Or let's say I just want to download the data from the results charts? Oh, it gets even better!

Over at DRF, if you want one year of comprehensive result charts (so, every track for one year), you pay $799. But that's a deal, since buying month by month takes $99 from your pocket every 30 days. Bris and Equibase pretty much run that same show: you can buy your data but you'll pay through the nose for it. Or I could do what I do right now: download the PDF charts for free and either hand-key the data into Excel, or find some computer science student down at UW and pay him to write a data-scraping program so I don't have to waste hundreds of hours hunched over my laptop.
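
For the record, here's the back-of-the-envelope math on those DRF prices as a quick Python sketch. The dollar figures are the ones quoted above; the function and variable names are just mine for illustration.

```python
# Prices quoted in this post for DRF result charts.
MONTHLY_PRICE = 99        # dollars for one month of result charts
ANNUAL_PRICE = 799        # dollars for one year of result charts
SINGLE_HORSE_PRICE = 10   # dollars for one horse's lifetime charts


def cost_of_monthly_plan(months):
    """Total cost of buying result charts one month at a time."""
    return MONTHLY_PRICE * months


def breakeven_months():
    """Smallest number of months at which the annual price is cheaper."""
    months = 1
    while cost_of_monthly_plan(months) <= ANNUAL_PRICE:
        months += 1
    return months


twelve_months = cost_of_monthly_plan(12)
print(f"12 months bought monthly: ${twelve_months}")
print(f"Savings with the annual price: ${twelve_months - ANNUAL_PRICE}")
print(f"Annual price wins starting in month {breakeven_months()}")
print(f"Single-horse charts for the same money: {ANNUAL_PRICE // SINGLE_HORSE_PRICE}")
```

So a full year bought month by month runs $1,188, the "deal" saves you $389, and either way the free price tag in every other sport looks a whole lot better.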

And the sport wonders why people aren't flocking to it to wager their money?

So let's recap, shall we? The major sports in this country, sports with an explosion of interest in fantasy sports* - especially football and baseball - give away their data for free. Horse racing, a sport that can't exist without people putting money through the windows, charges outrageous prices for even the most basic data and "gives" us PDF charts and other user-unfriendly formats.

*And that explosion of baseball data analysis driven by the SABR guys, creations like WAR, Runs Created, OPS+, xFIP, park factors, etc., etc., etc. - do you think that would have happened if MLB held on to its data the way horse racing does? Not in a million years.

Data for every single season and player in baseball history: FREE

Data for every single race at every track for one calendar year: $799

Are we sensing a problem yet?

If I were God of Horse Racing for a day, the first thing I'd do is unshackle the results of races from the few and open them up to all. If people want to take result chart data and create PPs that they charge people for? No problem at all. But under no circumstances should anyone have to pay to access result chart information in an Excel format. That nonsense would be over and done. Of course, we all know that this will never be the case.

Given all of this data goodness (or, really, badness), forgive me if I'm skeptical of the Equibase-STATS partnership. Based on the current lay of the land, I'm pretty sure we'll be presented with some dressed-up data packages that still require horseplayers to shell out good money to access what a fan of any other sport can access for free: results. And quotes like this one all but cement that notion in my mind:

"The real focus is to customize the tools to allow our data scientists to build projections on how a race will be run and weight different values, but also to give the horseplayer control," he said. "Veteran horseplayers all believe they have certain theories and views that they've built over time. If you build a system that allows them to apply their theories to a base that is very strong, then you have a tool that really takes things to the next level." (emphasis added)


If you want to give the horseplayer control, do what every other major sport has done and free the results data. I don't need a company to design projections or build me handicapping programs. I just want to access results charts in a user-friendly format without paying through the nose. But, as usual, the industry just doesn't get it. At all.