In May, we released Golazo into beta and got to experience that wonderful feeling when a big idea sheds its ethereal form and becomes physical. Ok, digital. Either way, it was euphoric, especially because we had overwhelmingly positive feedback from our fans.
May also marked my one-year anniversary with Major League Soccer. When I first came to MLS, the team had been running full bore for almost two years. The initial launch of 19 sites (we manage the club and league sites) is a tall task for any team, and it had kept our small team of devs very busy. But by spring of 2012, the team had started to think beyond the standard sports CMS product and had begun working on some concepts for a brand new second screen experience focused around live matches: MatchCenter 2.0.
The idea was basically this: ingest stats, news, social content, chat, photos, videos, and live streams and create a single place for our fans to park their computers or tablets or phones during a match that they were already watching on television or MLS Live. The goal was to create the best second screen experience in sports. Not just soccer, all sports. A goal we are definitely still working towards.
There were some static mockups of the new MatchCenter, but from a technical standpoint, it was totally greenfield. We knew we wanted to take advantage of websockets to make the app feel as real-time as possible, and since I had spent the last several years playing around with Node.js and Backbone to build real-time-ish web apps, choosing them seemed like an easy decision. The team had recently, ahem, contracted in size. By then there were only two of us: Hans and me. Hans was interested in learning Node and I was excited to build something big with it, so away we went.
Every good project needs a good code name. We have several native Spanish speakers in the office here and when we are all watching an exciting match (at MLS we like to watch soccer all day, every day), they would often yell out “Golazo!”, which means something like “great goal”. The name just stuck, though unfortunately, the idea of using Spanish soccer terms for all our code words did not, and after a failed attempt with “Bicicleta” we switched to Greek Titans. But I digress.
Golazo’s architecture was designed with three goals in mind: real-time, horizontal scaling, and fault tolerance.
- Real-time - I hate to use the word real-time, because some embedded engineer somewhere is going to send me hate mail about what real real-time really is. The point of Golazo is to minimize the time it takes to get data from our backend servers to the screens of our fans.
- Horizontal scaling - Life at MLS wasn’t always soccer and unicorns. We had some major issues scaling up our web infrastructure. The scars from those battles have made sure we always design scalability in from the beginning. We wanted to ensure that scaling would be a matter of adding more servers (horizontal scaling), instead of bigger servers (vertical scaling). This aspect of the design is always the most challenging, because it tends to force a distributed architecture (along with all the corresponding trade-offs).
- Fault tolerance - Soccer is important! ‘nuff said.
The core idea
The core idea of Golazo is this: by using a stateful, event-driven programming model on the server, and allowing clients (web browsers) to subscribe to these events over websockets, we are able to push an event from the server to the client as soon as it happens. This is in contrast to a more traditional polling model where the client periodically checks the server for new events.
To accomplish this, we built a set of separate back-end services in Node.js and a front-end client using Backbone.js. By overriding Backbone’s sync function to use websockets, we are able to push an event (such as a new tweet) into the browser and let Backbone’s model-view-controller system do the rest.
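To make the push model concrete, here is a minimal sketch of the client side. It is not Golazo's actual code: Node's EventEmitter stands in for the websocket, a small `GameFeed` class stands in for a Backbone collection, and the message shape (`type`, `match`, `data`) is an assumption made up for the example.

```javascript
const { EventEmitter } = require("events");

// Stand-in for a Backbone collection (hypothetical): holds models and
// emits "add" so a view can redraw when something new arrives.
class GameFeed extends EventEmitter {
  constructor() {
    super();
    this.models = [];
  }
  add(model) {
    this.models.unshift(model); // newest item goes to the top of the feed
    this.emit("add", model);
  }
}

// Wire a socket-like object to the feed. With polling, the client would
// ask the server on a timer; here the server decides when data arrives.
function subscribe(socket, feed, matchId) {
  socket.on("message", (raw) => {
    const msg = JSON.parse(raw);
    if (msg.match !== matchId) return; // ignore events for other matches
    feed.add({ type: msg.type, ...msg.data });
  });
}

// Usage: a fake socket pushes two events; only one is for our match.
const socket = new EventEmitter();
const feed = new GameFeed();
const rendered = [];
feed.on("add", (model) => rendered.push(`${model.type}: ${model.text}`));
subscribe(socket, feed, "match-1");

socket.emit("message", JSON.stringify({
  type: "tweet", match: "match-1", data: { text: "Golazo!" },
}));
socket.emit("message", JSON.stringify({
  type: "tweet", match: "match-2", data: { text: "other match" },
}));
```

The client never asks for anything after subscribing. The view layer reacts to the collection's "add" event, which is roughly the role Backbone plays in the real app.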
At its core, Golazo is a set of distributed and loosely coupled services that communicate via pubsub. Services fall into two general categories: publishers and consumers. A publisher is any service that pushes data into Golazo. We have publishers for Twitter, Instagram, statistics, APIs, videos, and several more. Consumers are services that push Golazo’s data out to connected clients (browsers).
Publishers, by design, don’t have to scale out. We only need to consume a tweet once to ingest it in our system. Social content is curated by our staff, so even for a popular match Golazo won’t have to handle more than a few thousand tweets an hour (any more than this and curation becomes unwieldy). Each publisher is independent of the others, so if one service crashes or is otherwise non-functional, the others can continue pushing content into Golazo. This lets the software degrade as gracefully as possible when external data sources become unavailable.
Consumers do exactly as they are named. They consume the events broadcast across the pubsub channels and pass them along to any connected clients. Websockets require a stateful, persistent back-end connection, and the main job of the consumer is to maintain these connections and route data to the appropriate clients. This design makes it easy to scale out the part of Golazo that is exposed to real variation in load. Take tweets for example. A popular match may have tens of thousands of connected clients, but a tweet is received once by the Twitter publisher, published once to the pubsub channel, and received once by each active consumer node. The consumers can then transmit the resulting message to their connected clients without ever retrieving the data from our data store.
If load or latency becomes too high on the consumer pool, we can just spin up more. Each new consumer adds minimal strain to the existing infrastructure and can handle (in theory) thousands more clients.
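The publisher and consumer split can be sketched in a few lines. This is an illustration under loud assumptions: an in-process EventEmitter stands in for the Redis pubsub channel, and `createPublisher`, `createConsumer`, and the event fields are hypothetical names, not Golazo's API.

```javascript
const { EventEmitter } = require("events");

// In-process stand-in for a Redis pubsub channel (assumption for this sketch).
const bus = new EventEmitter();

// A publisher ingests an item once and publishes it once.
function createPublisher(name) {
  return {
    publish(matchId, data) {
      bus.emit("event", { source: name, matchId, data });
    },
  };
}

// A consumer owns its own client connections and relays each event it
// hears to the clients subscribed to that match.
function createConsumer() {
  const clients = []; // entries look like { matchId, send }
  let received = 0;
  bus.on("event", (event) => {
    received += 1;
    for (const client of clients) {
      if (client.matchId === event.matchId) client.send(event);
    }
  });
  return {
    connect(matchId, send) { clients.push({ matchId, send }); },
    receivedCount() { return received; },
  };
}

// Usage: two consumers, three clients, one tweet.
const consumerA = createConsumer();
const consumerB = createConsumer();
const delivered = [];
consumerA.connect("match-1", (e) => delivered.push(["a1", e.data]));
consumerA.connect("match-1", (e) => delivered.push(["a2", e.data]));
consumerB.connect("match-2", (e) => delivered.push(["b1", e.data]));

createPublisher("twitter").publish("match-1", "Golazo!");
```

Each consumer hears the tweet exactly once, no matter how many clients it serves, and adding a consumer is just one more listener on the channel.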
Pubsub and data stores
I have referred to pubsub several times. We are using Redis for pubsub because it is easy to set up and we also get to use it as a high performance store for certain operations like generating sequence identifiers and capturing published events for playback (a topic for another post, perhaps).
For long-term storage, we are using Couchbase. All the data published by the various publishers is persisted to our Couchbase cluster before it is broadcast out to the other services. When the client connects to a completed or in-progress match, they get the current state up to that point in time from Couchbase.
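The persist-then-broadcast ordering is worth a small sketch. Everything here is a stand-in: a Map plays the role of Couchbase, an EventEmitter plays the role of Redis pubsub, a plain counter plays the role of the Redis-generated sequence identifier, and `ingest` and `connect` are hypothetical function names.

```javascript
const { EventEmitter } = require("events");

const store = new Map();        // stand-in for Couchbase: matchId -> events
const bus = new EventEmitter(); // stand-in for Redis pubsub
let sequence = 0;               // stand-in for a Redis-generated sequence id

// Persist first, then broadcast, so anything a client hears over the
// wire can also be found in the store.
function ingest(matchId, data) {
  const event = { seq: ++sequence, matchId, data };
  if (!store.has(matchId)) store.set(matchId, []);
  store.get(matchId).push(event);
  bus.emit(matchId, event);
  return event;
}

// A connecting client gets a copy of the state so far, then live updates.
function connect(matchId, onEvent) {
  const snapshot = [...(store.get(matchId) || [])];
  bus.on(matchId, onEvent);
  return snapshot;
}

// Usage: two events happen before the client connects, one after.
ingest("match-1", "kickoff");
ingest("match-1", "goal");
const live = [];
const snapshot = connect("match-1", (e) => live.push(e.data));
ingest("match-1", "full time");
```

A client that joins late sees "kickoff" and "goal" in its snapshot and receives "full time" as a live event, which is the same behavior whether the match is in progress or already over.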
Couchbase is pretty young (most people confuse it with CouchDB!). The decision to use Couchbase was driven by a desire to keep all levels of our stack horizontally scalable while avoiding the pain of managing data sharding (and corresponding infrastructure complexity) ourselves. I also felt pretty strongly that I wanted to avoid any sort of master/slave replication, given how much of my previous job had been spent babysitting SQL and MySQL database replication.
Regarding sharding: It is not that we don’t believe in sharding, but rather that our team was two people at the time and we were trying to get an MVP out the door. We can’t all be Facebook on day 1.
One of our partners had been using Couchbase in production for some time and convinced us it was worth a try. It fit the bill in terms of managing data sharding for you, making it easy to add new nodes, and it has a truly fantastic web interface to boot. So far it has been everything it was promised to be. Easy to manage and easy to use.
End to End
The final result is an app that creates the following experience:
- Client: client connects to a Golazo match.
- Server: Golazo renders (in HTML on the server) the current state of the match and sends it back to the client.
- Client: the client displays the current match state and subscribes to future updates to the match via websocket.
- Some time passes.
- Server: a tweet arrives at the Golazo Twitter publisher, which ingests it and persists it in Couchbase.
- Server: a Redis pubsub event with the tweet content is broadcast to the connected consumers.
- Server: the consumers process the tweet event and relay it via websocket to any clients subscribed to that match.
- Client: the client receives the tweet event over the websocket and creates a new tweet model and adds it to its collection of game feed events.
- Client: Backbone does its magic and draws the tweet at the top of the game feed.
Where do we go from here?
We are on the cusp of shipping Golazo v1. Even though there is a ton more to do, we need to get it out into the wild and get some real-world miles on the software to make sure we are really headed in the right direction for our fans. So the question becomes: what’s next?
Golazo was conceived as a web app, but more and more of our fans are accessing MLSsoccer.com over mobile apps and connected devices. In order to make Golazo as accessible as possible, we want to build a real-time API (think Twitter) for our apps and devices to use. Golazo itself will be refactored to become a client of the API. You know, eat your own dog food.
Fit and Finish
Golazo is certainly a v1. We are well aware of our design and usability shortcomings and we plan to fix them. We are currently hiring a UX Designer to help us kick this process into high gear.
We love stats. If left to our own devices, we would just crank out crazy visualizations all day. There is no end to the improvements we could make here. We will remain focused on creating stats that help non-experts understand the game better.
Another big goal for us (from a technical standpoint) is to make sure that Golazo can survive hurricanes, earthquakes, floods, and nuclear strikes! To do this, we intend to run Golazo in multiple data centers and route requests to the lowest latency location.
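As a rough sketch of what "route requests to the lowest latency location" could mean, the function below picks the healthy location with the smallest measured latency. The `pickLocation` name, the `latencyMs` and `healthy` fields, and the sample numbers are all hypothetical. How latency would be measured, and where routing would actually happen, is not decided here.

```javascript
// Pick the healthy data center with the lowest measured latency.
// `locations` and its fields are hypothetical names for this sketch.
function pickLocation(locations) {
  const healthy = locations.filter((loc) => loc.healthy);
  if (healthy.length === 0) return null; // nothing available to route to
  return healthy.reduce((best, loc) =>
    (loc.latencyMs < best.latencyMs ? loc : best));
}

// Usage: "west" is fastest but down, so the request goes to "east".
const choice = pickLocation([
  { name: "east", latencyMs: 40, healthy: true },
  { name: "west", latencyMs: 25, healthy: false },
  { name: "central", latencyMs: 60, healthy: true },
]);
```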
Golazo is the first iteration of what we hope becomes one of our flagship products. The team has expanded and we are making a big investment into pushing boundaries for soccer and the web. It has also begun to change the way we build software here at MLS Digital. Our new apps and products are moving toward a services model that Golazo helped to forge. We have a great team, great ideas, and I am very excited to see where we can go in the next couple of years.
Author: Justin Slattery (@jdslatts)