This is Part 1 of a multi-part series. I don't know how many parts... as many as I come up with.
Over the past year I’ve been working on a graded stakes database and a system for rating horses and the strength of individual races. It’s been a bear of a project because it involves a crapload of manual data entry, along with weekly updates, since the ratings change as more data is added to the population. The database includes the top three finishers from every graded stakes race in North America since January 2013, along with the field size, finish margins, an assortment of speed and pace figures, and other ratings.
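For anyone who thinks better in code, here is a rough sketch of what one race record in a database like this might look like. It is only an illustration: the class names, field names, and the example values are all hypothetical placeholders, not the actual layout or real race data.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Finisher:
    """One of the top three finishers in a race (hypothetical layout)."""
    horse: str
    position: int                         # 1, 2, or 3
    lengths_behind: float                 # finish margin behind the winner; 0.0 for the winner
    speed_figure: Optional[float] = None  # left empty until a figure is entered
    pace_figure: Optional[float] = None


@dataclass
class StakesRace:
    """One graded stakes race with its field size and top three finishers."""
    name: str
    race_date: date
    grade: int                            # 1, 2, or 3
    field_size: int
    finishers: List[Finisher] = field(default_factory=list)

    def add_finisher(self, finisher: Finisher) -> None:
        # Only the top three are stored, per the description above.
        if finisher.position not in (1, 2, 3):
            raise ValueError("only the top three finishers are stored")
        self.finishers.append(finisher)
        # Keep finishers in finish order regardless of entry order.
        self.finishers.sort(key=lambda f: f.position)


# Placeholder values only; this is not a real race or real figures.
race = StakesRace("Example Stakes", date(2013, 1, 5), grade=2, field_size=8)
race.add_finisher(Finisher("Horse B", position=2, lengths_behind=1.5))
race.add_finisher(Finisher("Horse A", position=1, lengths_behind=0.0, speed_figure=100.0))
print([f.horse for f in race.finishers])
```

The figures are optional in this sketch because manual entry tends to happen in passes, so a record can exist before every number is filled in.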
Piggybacking on my post from the other day, speed figures, pace figures, and ratings capture different elements of a horse’s performance in a specific race. A speed figure highlights the final time given the relative speed of the track; a pace figure describes the shape of the race; a rating incorporating weight views the performance of the horse in relation to the other horses in the race; and on and on and on. I like to look at all of those factors but wanted to try to come up with a way to represent performances across the entire spectrum of stakes races. Additionally, I wanted a rating that would change over time given the relative strength or weakness of subsequent races. Those goals led me to this database project.