Case Study

Bloomberg

Bloomberg: An early adopter’s success with Kubernetes at scale

Challenge

Founded in 1981, Bloomberg runs one of the largest private networks in the world. Every day, the company deals with hundreds of billions of pieces of data, and there are 14,000 different applications on the Terminal alone. In recent years, the infrastructure team has worked on delivering infrastructure as a service while spinning up a lot of VMs and scaling as needed. “But that did not give app teams enough flexibility for their application development, especially when they needed to scale out faster than the actual requests that are coming in,” says Andrey Rybka, Head of the Compute Architecture Team in Bloomberg’s Office of the CTO. “We needed to deploy things in a uniform way across the entire network, and we wanted our private cloud to be as easy to use as the public cloud.”

Solution

After evaluating multiple frameworks in 2016, the team decided to adopt Kubernetes, even though it was then still in alpha. “Kubernetes gave us more flexibility than the other frameworks, and allowed us to scale out solutions much faster than if we were to spin up VMs,” says Rybka. “The community was growing very rapidly, and the features we saw were aligned with what we were looking for.”

Impact

Bloomberg has seen many benefits from Kubernetes: more productive developers, fewer errors when applications are being deployed, improved resiliency of services, automation without much effort, repeatability of deployment, and improved resource management. “With Kubernetes, we’re able to very efficiently use our hardware to the point where we can get close to 90 to 95% utilization rates,” says Rybka. 

Published:
September 12, 2019

Projects used

Fluentd
Kubernetes
Prometheus

By the numbers

Time to market

Could be minutes

Hardware utilization

Gets close to 90-95% with overprovisioning in Kubernetes

Hundreds of Kubernetes clusters, thousands of nodes each

Founded by Michael Bloomberg in 1981, Bloomberg L.P. lives at the nexus of finance, technology, and media. With its Terminal, the company runs one of the largest private networks in the world.

Every day it deals with hundreds of billions of pieces of data coming in from the financial markets and millions of news stories from hundreds of thousands of sources. There are 14,000 different applications on the Terminal alone. 

At that scale, delivering information across the globe with high reliability and low latency is a big challenge for the company’s Engineering department of more than 5,500 people. Bloomberg has embraced a service-oriented architecture since the 1980s, and in recent years, the infrastructure team has been working on delivering infrastructure as a service while spinning up a lot of VMs and scaling as needed. “But that did not give app teams enough flexibility for their application development, especially when they needed to scale out faster than the actual requests that are coming in,” says Andrey Rybka, Head of the Compute Architecture Team in Bloomberg’s Office of the CTO. “We needed to deploy things in a uniform way across the entire network, and we wanted our private cloud to become as easy to use as the public cloud.”

A few years ago, Bloomberg engineers started writing their own containerization and orchestration software to support the deployment of about 6,000 instances of Apache Solr, an open source enterprise search platform, across roughly 1,000 servers. But they soon realized there must be better solutions out there.

A small team within the CTO’s Office began evaluating multiple frameworks at the beginning of 2016. Kubernetes stood out, even though it was then still in alpha. “Kubernetes gave us more flexibility than the other frameworks, and allowed us to scale solutions much faster than if we were to spin up VMs,” says Rybka. “The community was growing very rapidly, and the features we saw were aligned with what we were looking for.”

That bet has paid off: “We could look at the plan for future Kubernetes releases, and if there was functionality that we wanted that didn’t exist yet, there were plans for it. And then they got delivered,” says Kevin Fleming, Head of Open Source Community Engagement in Bloomberg’s Office of the CTO. 

Pilots were conducted in both private and public cloud deployments, and at the same time, engineering management tasked Bloomberg’s Developer Experience (DevX) team with ensuring that there was rock-solid continuous integration in place. “They realized we needed end-to-end testing and reporting and all those sorts of things so that the deployment of thousands of microservices, instead of dozens of large applications, would actually be a tractable problem,” says Fleming.

Having that foundation made the migration a smooth and successful process. In early 2017, the company began running its first production software on Kubernetes. Since then, it has seen many benefits from Kubernetes: more productive developers, fewer errors when applications are being deployed, improved resiliency of services, automation without much effort, repeatability of deployment, and improved resource management. “With Kubernetes, we’re able to very efficiently use our hardware to the point where we can get close to 90 to 95% utilization rates,” says Rybka.

Autoscaling in Kubernetes also allows the system to meet demands much faster, Rybka adds: “You don’t have to ask for more servers. If you know it’s market opening, 9:30 a.m. Eastern Time, and you suddenly need more CPU and memory, you just say, ‘I need two minimum instances of my service, but it might go to 12.’ And you don’t even need to specify the time when you need it. As long as the metric that you are trying to autoscale on is accurately tracking the CPU usage increasing, for example, then you’re going to get more instances.”
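To make the “two minimum instances, but it might go to 12” idea concrete, here is a minimal sketch of the scaling rule described in the Kubernetes Horizontal Pod Autoscaler documentation: the desired replica count is the current count scaled by the ratio of the observed metric to its target, rounded up and clamped to the configured bounds. The function name, the 60% CPU target, and the sample readings are illustrative assumptions, not Bloomberg’s configuration, and the real controller adds behavior (stabilization windows, readiness handling) that is left out here.

```python
import math


def desired_replicas(current_replicas, current_cpu_percent, target_cpu_percent,
                     min_replicas=2, max_replicas=12, tolerance=0.1):
    """Simplified sketch of the Horizontal Pod Autoscaler scaling rule.

    Assumed, illustrative parameters: CPU utilization is the only metric,
    and the bounds default to the 2..12 range from the quote above.
    """
    ratio = current_cpu_percent / target_cpu_percent
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current size (within bounds).
        desired = current_replicas
    else:
        # Scale proportionally to how far the metric is from its target.
        desired = math.ceil(current_replicas * current_cpu_percent / target_cpu_percent)
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Hypothetical readings around a market open, with a 60% CPU target.
    for replicas, cpu in [(2, 90), (4, 240), (6, 10), (4, 62)]:
        print(replicas, cpu, "->", desired_replicas(replicas, cpu, 60))
```

Nothing in the rule refers to a time of day, which matches Rybka’s point: as long as the metric rises with load, more instances follow, up to the configured maximum.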

“A lot of our application teams are making the leap directly from monolithic applications on big iron machines to stateless microservices running in cloud native infrastructure.”

— STEVEN BOWER, DATA AND ANALYTICS INFRASTRUCTURE LEAD AT BLOOMBERG

Furthermore, “Kubernetes has offered us the ability to standardize our approach to how we build and manage services, which means that we can spend more time focused on actually working on the open source tools that we support,” says Steven Bower, Data and Analytics Infrastructure Lead. “If we want to stand up a new cluster in another location in the world, it’s really very straightforward to do that. Everything is all just code. Configuration is code.”
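As a rough illustration of “configuration is code,” the sketch below generates the same autoscaling configuration for several locations from a single function. It is not Bloomberg’s tooling: the service name, the location names, the `example.com/location` label key, and the 70% CPU target are all hypothetical. The manifest follows the `autoscaling/v2` HorizontalPodAutoscaler schema and is emitted as JSON, which Kubernetes accepts alongside YAML.

```python
import json


def hpa_manifest(service_name, location, min_replicas=2, max_replicas=12,
                 target_cpu_percent=70):
    """Build an autoscaling/v2 HorizontalPodAutoscaler manifest as a dict.

    The label key and default values are illustrative assumptions.
    """
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {
            "name": f"{service_name}-hpa",
            "labels": {"example.com/location": location},
        },
        "spec": {
            # The Deployment whose replica count the autoscaler manages.
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": service_name,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [{
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    "target": {
                        "type": "Utilization",
                        "averageUtilization": target_cpu_percent,
                    },
                },
            }],
        },
    }


def render_all(service_name, locations):
    """Render one JSON manifest per location from the same definition."""
    return {
        loc: json.dumps(hpa_manifest(service_name, loc), indent=2, sort_keys=True)
        for loc in locations
    }


if __name__ == "__main__":
    # Hypothetical service and locations.
    for loc, text in render_all("quote-service", ["nyc", "lon"]).items():
        print(f"--- {loc} ---")
        print(text)
```

Because each location’s manifest comes from the same function, adding a location is a one-line change that can be reviewed and versioned like any other code.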

As a result, “developers now essentially just compose their infrastructure from the existing building blocks,” says Rybka. “We have quite a bit of capability that you normally find in public cloud on premises. You want the CI/CD pipeline? No problem. We simplified quite a bit, and time to market for new services could be minutes if you really wanted to speed it up.”

Bower points out that “a lot of our application teams are making the leap directly from monolithic applications on big iron machines to stateless microservices running in cloud native infrastructure”—which is a remarkable cultural shift. 

In fact, Rybka reports that he has “an avalanche” of people who want access to Kubernetes, leading to something of a waitlist. Demand currently exceeds the available resources, and the infrastructure team is also taking a cautious, gradual approach. “We want to make sure that everything is written in a 12-factor and cloud native way, and that failure and resiliency are built in,” he says. “We want to really provide stable and safe migration paths.”

“I could look at the planning for future Kubernetes releases, and if there was functionality that we wanted that didn’t exist yet, there were plans for it. And then they got delivered.”

— KEVIN FLEMING, HEAD OF OPEN SOURCE COMMUNITY ENGAGEMENT AT BLOOMBERG’S OFFICE OF THE CTO

Another focus for the team is to work on building and improving open source tooling. So far, they’ve contributed to Kubeflow and Knative, and have released some of their own tools, which are gaining traction: Goldpinger for visualizing the network map and PowerfulSeal for chaos testing on Kubernetes clusters. They’ve also adopted other CNCF projects, such as Prometheus and Fluentd, and one of their next big goals is to improve telemetry with technologies like Jaeger.

The Kubernetes journey has been an impactful one for Bloomberg, and the team members offer both encouragement and caution for others considering following suit. “Kubernetes and all these tools make things easier, but they don’t make them simpler,” says Fleming. “Networking is still networking, and you still have to understand it. And that leads to the second part which is: Hire smart people.” (Bloomberg is hiring too.)

Plus, says Bower, “moving to Kubernetes is only a small piece of the puzzle. What you’re talking about is moving to cloud native application development, deployment, and maintenance. So you still have to have all the rest of the stuff in place to enable people to do continuous integration and deployment, build container images and deploy them.”

For Rybka, the path from early adopter to expert has been an eye-opening one. “We’ve been to every KubeCon at this point,” he says. “I would never have predicted how fast this entire ecosystem has grown, and the architecture of the project doesn’t seem to suffer from all these different teams collaborating, right? It’s really nice to see how quickly things have matured.”