Race Data Management

The Challenge

Races take place all over the world every day, at all hours, and people place wagers on all of them. Our client runs a business that facilitates placing those wagers and predicting the winners.

The challenge our client faced was keeping track of when betting is open for a particular track. You cannot bet on a race whenever you want; wagers can only be placed at appointed times. Furthermore, these times may move earlier or later during the day depending on how efficiently the track is running.

Our database currently holds over 920M rows of data, so we also had to create a way to quickly analyze a large data set and dynamically manage it to keep up with the ebbs and flows of usage.

The Solution

To make wagering easier and to generate automatic updates on races, we built a bi-directional communication system. It automatically checks the authoritative data sources for racing events every 15-30 seconds and allows our client to place wagers at the appropriate time.
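As a rough illustration of the polling side of such a system, here is a minimal Python sketch. It is not the production code: the RaceWindow shape, the function names, and the choice to pick a random delay inside the 15-30 second range are all assumptions made for the example, and fetch_windows stands in for whatever call reaches the authoritative data source.

```python
import random
import time
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Iterable, List, Optional

MIN_POLL_SECONDS = 15  # lower bound of the 15-30 second polling range
MAX_POLL_SECONDS = 30  # upper bound of the 15-30 second polling range


@dataclass(frozen=True)
class RaceWindow:
    """Hypothetical shape of one race's wagering window."""
    race_id: str
    opens_at: datetime
    closes_at: datetime


def next_poll_delay(rng: random.Random) -> float:
    """Pick a delay in the polling range (randomizing it is an assumption)."""
    return rng.uniform(MIN_POLL_SECONDS, MAX_POLL_SECONDS)


def open_races(windows: Iterable[RaceWindow], now: datetime) -> List[str]:
    """Return the ids of races whose wagering window contains `now`."""
    return [w.race_id for w in windows if w.opens_at <= now < w.closes_at]


def poll(
    fetch_windows: Callable[[], Iterable[RaceWindow]],
    on_open: Callable[[List[str]], None],
    sleep: Callable[[float], None] = time.sleep,
    max_polls: Optional[int] = None,
) -> None:
    """Re-read the source on every pass, so moved post times are picked up."""
    rng = random.Random()
    polls = 0
    while max_polls is None or polls < max_polls:
        now = datetime.now(timezone.utc)
        on_open(open_races(fetch_windows(), now))
        polls += 1
        sleep(next_poll_delay(rng))
```

Because the windows are fetched fresh on each pass, a race whose time moves up or back is reflected within one polling interval. The sketch covers only the read direction; placing the wager itself is out of scope here.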

Once we have received the data, we store it in AWS data centers and use Terraform to automatically scale our cloud computing resources up or down.

Our system is constantly grabbing and updating data in our operational database (PostgreSQL) and structuring it in a consistent way, so that our client can be sure they always have the most recent data.
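One common way to keep only the most recent version of each record is an upsert that ignores stale updates. The sketch below is an illustration under stated assumptions, not our actual schema: the races table, its columns, and the normalize rules are hypothetical, and Python's built-in sqlite3 module stands in for PostgreSQL so the example runs without a server. PostgreSQL accepts the same INSERT ... ON CONFLICT ... DO UPDATE form, although a PostgreSQL driver would use its own parameter placeholder style.

```python
import sqlite3

# Hypothetical table: one row per race, stamped with the source's update time.
CREATE_SQL = """
CREATE TABLE races (
    race_id    TEXT PRIMARY KEY,
    track      TEXT NOT NULL,
    post_time  TEXT NOT NULL,
    updated_at TEXT NOT NULL
)
"""

# Insert a new race, or overwrite the existing row only if the incoming
# record is newer than the one already stored.
UPSERT_SQL = """
INSERT INTO races (race_id, track, post_time, updated_at)
VALUES (:race_id, :track, :post_time, :updated_at)
ON CONFLICT (race_id) DO UPDATE SET
    track = excluded.track,
    post_time = excluded.post_time,
    updated_at = excluded.updated_at
WHERE excluded.updated_at > races.updated_at
"""


def normalize(raw: dict) -> dict:
    """Put a raw feed record into one consistent shape (rules are assumed)."""
    return {
        "race_id": str(raw["race_id"]).strip(),
        "track": str(raw["track"]).strip().upper(),
        # Timestamps are assumed to be same-format ISO-8601 UTC strings,
        # which compare correctly as plain text.
        "post_time": raw["post_time"],
        "updated_at": raw["updated_at"],
    }


def upsert_race(conn: sqlite3.Connection, raw: dict) -> None:
    """Normalize one record and write it, keeping the newest version."""
    conn.execute(UPSERT_SQL, normalize(raw))
    conn.commit()
```

With this shape, a late-arriving older message cannot overwrite a newer post time, which is what lets a reader trust that the row they see is the latest one.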

The ultimate goal for our client is to predict the winners of races more accurately. They've built tools that allow them to analyze how the heats, tracks, and races are actually going to play out.

Our part is to move all the race data from the operational database into our ETL (extract, transform, load) tool for analytics. Managing their data in an analytics database rather than an operational database allows for quicker analysis using fewer computing resources. From there, our client applies the algorithms they developed to produce educated guesses about each race's outcome.
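To make the extract, transform, and load steps concrete, here is a small Python sketch that uses in-memory lists and dictionaries in place of real databases. Everything in it is assumed for illustration: the field names, the "changed since the last run" watermark, and the delay_minutes metric are not taken from our client's pipeline or from any particular ETL tool.

```python
from datetime import datetime
from typing import Dict, Iterable, List


def extract(operational_rows: Iterable[dict], since: str) -> List[dict]:
    """Extract: take only rows changed after the last run's watermark.

    Timestamps are assumed to be same-format ISO-8601 strings, so plain
    string comparison orders them correctly.
    """
    return [r for r in operational_rows if r["updated_at"] > since]


def transform(rows: Iterable[dict]) -> List[dict]:
    """Transform: flatten each race into an analysis-friendly fact row."""
    facts = []
    for r in rows:
        scheduled = datetime.fromisoformat(r["scheduled_post"])
        actual = datetime.fromisoformat(r["actual_post"])
        facts.append({
            "race_id": r["race_id"],
            "track": r["track"],
            # How far the race start drifted from its schedule, in minutes.
            "delay_minutes": (actual - scheduled).total_seconds() / 60,
        })
    return facts


def load(facts: Iterable[dict], analytics_table: Dict[str, dict]) -> int:
    """Load: write fact rows into the analytics store, keyed by race id."""
    count = 0
    for fact in facts:
        analytics_table[fact["race_id"]] = fact
        count += 1
    return count
```

Doing the date arithmetic once during the transform step means analytical queries read a precomputed number instead of recalculating it against the operational data every time.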

The result is that our client has a steady flow of the data they need for their wagering operation, server computing power spins up and down with demand, and reporting is rapid thanks to modern ETL tooling and analytics databases.