Case Study

SmartNews

SmartNews leverages Cilium to improve performance and scaling

Challenge

SmartNews is a global news aggregation application originating from Japan. They receive news from different publishers and, with the help of machine learning algorithms, rank the news and recommend the most interesting articles based on users’ historical viewing data. 

SmartNews initially built its Kubernetes platform on AWS (Amazon Web Services), using kube-proxy for network communications. However, as the platform scaled and required more nodes and clusters, they encountered two major issues: kube-proxy could not handle the increased load efficiently, and AWS load balancers became increasingly expensive.

In response to these challenges, SmartNews began searching for an alternative tool to replace kube-proxy. They needed a solution that would not only manage the network load more effectively but also enable seamless communication across multiple Kubernetes clusters.

Solution

After considering various options, SmartNews selected Cilium for its ability to replace kube-proxy through the kube-proxy replacement (KPR) feature and its use of eBPF. Additionally, Cilium’s Cluster Mesh feature fulfilled their need for cross-cluster communication, which has now become a crucial part of their infrastructure.

With Cilium, they now have improved performance and enhanced scalability and security for their growing number of Kubernetes clusters.

Impact

Cilium has become the go-to solution for replacing kube-proxy in SmartNews’ new Kubernetes clusters. It boosted their network performance and offered a cost-efficient solution during peak loads. Overall, Cilium has significantly improved the team’s capability to offer a secure and high-performance Kubernetes platform, enhancing user experience.

Published: March 30, 2024

Projects used

Argo
Cilium
Kubernetes

By the numbers

39 EKS Clusters

2140 Nodes

3 Regions

Scaling Network Performance and Cutting Costs with Cilium

SmartNews is a multi-platform news aggregation service that employs machine learning algorithms to curate and deliver personalized news content to its users. It gathers top news articles from various publishers and provides a streamlined reading experience through its user-friendly app interface. Launched in Japan in 2012, SmartNews has since expanded internationally, aiming to be a leading source of timely, trustworthy news accessible to audiences worldwide.

Their platform team, made up of six members, manages Kubernetes clusters across three regions: Tokyo, Virginia, and Oregon. Each region hosts multiple Amazon EKS clusters. Deployments are automated with Argo CD and Jsonnet, and the clusters run online services, machine learning workloads, and other data processing jobs. Initially, they used kube-proxy for networking but soon encountered scalability and performance challenges.

Recognizing these challenges, they started looking for a mature, rapidly evolving solution that could replace kube-proxy and enable cross-cluster communication.

“We had two pain points and the first one was that we have a cluster with thousands of nodes and in this kind of size of cluster, kube-proxy becomes very hard to use because the overhead is huge. We provision pods and delete pods all the time putting the Kubernetes API under heavy load and consuming up to one CPU per node. 

Based on our requirement for workload isolation, we also need to provision a lot more clusters and the AWS load balancer is not cheap. With these problems, we started looking for something that could replace kube-proxy and provide cheap, elegant cross-cluster communication.”

Simon Wu, Software Engineer, SmartNews

SmartNews conducted a thorough evaluation of various solutions, including Istio, Calico, and Cilium. They ultimately chose Cilium because it met their specific needs: it replaced kube-proxy, used eBPF for advanced networking at low overhead, provided cross-cluster communication through Cluster Mesh, and offered comprehensive learning resources.

“We did look at Istio, client-side load balancing, and the AWS load balancer. After the comparison, we chose Cilium because it’s sidecar free for lower resource consumption, easy to configure global services, and cheaper than AWS LB creating a very elegant solution for us. I think Cilium also has a good community, the learning resources are abundant, and it has good documentation. We see Cilium evolving day by day and we have confidence that if we have some breaking issues, we can turn to the community for help.”

Simon Wu, Software Engineer, SmartNews

“Another important factor is that Cilium is using eBPF and I believe that eBPF is the future.”

Luke, Infrastructure Team Lead, SmartNews

After selecting Cilium, SmartNews provisioned new clusters with it, configured Cluster Mesh in each region, and then transitioned their services into these clusters. This approach allowed them to strike a balance between new and old technologies, providing an opportunity to test services in the new cluster without affecting their existing online services. During the migration to Cilium, they also chose to switch from IPv4 to IPv6. 

“The new Cilium agent consumes much less resources because of eBPF. Also, the Cilium global service is working well for cross-cluster communication and we’re able to reduce the cost of our AWS load balancers now.”

Simon Wu, Software Engineer, SmartNews
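
For readers unfamiliar with the "global service" mentioned in the quote above, the following is a minimal sketch of how such a service is typically declared, not SmartNews' actual configuration. In Cluster Mesh, a Kubernetes Service with the same name and namespace is created in each connected cluster and marked with the `service.cilium.io/global` annotation, after which Cilium load-balances across the backends in all of those clusters. The service name, namespace, port, and selector below are hypothetical, and the annotation keys should be checked against the documentation for the Cilium version in use.

```python
import json

# Affinity values accepted by the service.cilium.io/affinity annotation
# (as documented by Cilium; verify for your version).
VALID_AFFINITIES = ("local", "remote", "none")


def global_service_manifest(name, namespace, port, selector, affinity=None):
    """Build a Service manifest marked as a Cilium Cluster Mesh global service.

    The same manifest (same name and namespace) would be applied in every
    cluster that should share the service.
    """
    annotations = {"service.cilium.io/global": "true"}
    if affinity is not None:
        if affinity not in VALID_AFFINITIES:
            raise ValueError(f"affinity must be one of {VALID_AFFINITIES}")
        # Optional: prefer backends in the local or a remote cluster.
        annotations["service.cilium.io/affinity"] = affinity

    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": annotations,
        },
        "spec": {
            "type": "ClusterIP",
            "selector": dict(selector),
            "ports": [{"port": port, "targetPort": port, "protocol": "TCP"}],
        },
    }


if __name__ == "__main__":
    # Hypothetical service used purely for illustration.
    manifest = global_service_manifest(
        name="ranking-api",
        namespace="news",
        port=8080,
        selector={"app": "ranking-api"},
        affinity="local",
    )
    # JSON is valid YAML, so the output can be piped to `kubectl apply -f -`.
    print(json.dumps(manifest, indent=2))
```

Because the service stays a plain ClusterIP Service, cross-cluster traffic does not need a cloud load balancer in front of it, which is consistent with the cost reduction described in the quote.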

Addressing Infrastructure Demands with Cilium

The migration to Cilium has gone well for the SmartNews infrastructure team, with 90% of the clusters in their development environment already transitioned. The move has resolved the scalability problems they experienced with kube-proxy on their Kubernetes platform, and adopting Cilium has also reduced what they spend on their cloud provider’s load balancers.

Since Cilium has addressed many of their performance and cost concerns, SmartNews is exploring further uses for Cilium’s extensive feature set. They are currently evaluating Hubble for deep observability into their network flows.

“We are evaluating Hubble to monitor our network traffic flows and try to identify issues early.

We’re excited to see what Cilium will bring to us next! There’s a lot of interesting things in the future and we are keeping our eyes open.”

Simon Wu, Software Engineer, SmartNews