Service proxy, service mesh or API gateway – which do you need?

Posted on December 20, 2023 by Ahmet Soormally and Carol Cheung

Member post originally published on Tyk’s blog by Ahmet Soormally and Carol Cheung

Illustration of service proxy, service mesh and API gateway

The rise of the microservices architecture has brought with it a whole heap of efficiency and flexibility – and some interesting challenges. As organisations have grown the number of services they use, network communications have become increasingly complex.

This has led to a need for a range of tools optimised for solving different problems, including service proxies, service meshes and API gateways.

To the cloud native novice, service proxies, service meshes and API gateways seem to do similar things. However, as always, the devil is in the details. If you’re currently dealing with tens, hundreds or even thousands of apps, you’re undoubtedly aware that network communication is unreliable, takes time and is susceptible to attack. Having replicas of your apps running on different nodes in one or more clusters, with those clusters spread all over the world, covering multiple regions, only adds to the complexity.

This is why the right tools are so important. But which do you need – a service proxy, a service mesh or an API gateway? Let’s walk through the fundamental concepts and approaches to each of these to throw some light on the subject.

A quick word on Kubernetes

Before we dive in, it’s worth pointing out that Kubernetes can solve many of your networking challenges. For instance, a Kubernetes service allows you to reach the correct application, wherever it may be, using a static service name. Ingress allows you to route external traffic to the proper application based on the request’s host and path. In short, Kubernetes helps deal with the complex web of remote procedure calls that are transmitted over a network.

But many businesses need more than Kubernetes alone. What about load-balancing requests across all the replicas of an application? What if you want mutual TLS or distributed tracing? Or to rate limit requests and guard against web app security vulnerabilities?

Over time, service proxies, service meshes and API gateways were developed to solve these cross-cutting concerns. Importantly, they all solved them without requiring applications to be rewritten. Let’s look at how each of them helps.

Service proxy

A service proxy is a data plane component that intercepts traffic to or from a given service, applies some logic, and then forwards that traffic to another service. It essentially acts as a go-between, collecting information about network traffic and/or applying rules. The service proxy evaluates a client’s request and applies changes to it.

These changes are configured and executed within the service proxy, so there’s no need to modify the application itself. Rather than connecting to the microservice application directly, clients connect to the proxy.

Service proxies are general purpose. You can apply different kinds of checks or modify traffic however you like. Examples include terminating TLS, executing authentication and authorisation checks, rate limiting, caching, load balancing, adding or removing HTTP headers, dynamic routing based on system metrics and supporting blue-green deployments or canary testing strategies.

Sitting between the client and the target service, the proxy can introduce new behaviours within the request sequence. This means it can ensure traffic is routed to the correct destination service or container, apply security policies, and meet application performance needs.
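As a rough illustration of the idea, here is a minimal Python sketch of a proxy as a chain of filters sitting in front of a service. Everything in it is hypothetical: the `Request` and `Response` types, the `X-Api-Key` and `X-Request-Id` headers, and the `orders_service` stand-in are invented for the example, and a real proxy would of course work on network connections rather than in-process function calls.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Request:
    method: str
    path: str
    headers: dict = field(default_factory=dict)


@dataclass
class Response:
    status: int
    body: str = ""


# A filter returns a Response to short-circuit, or None to let the request continue.
Filter = Callable[[Request], Optional[Response]]


def require_api_key(request: Request) -> Optional[Response]:
    """Reject requests that lack the (hypothetical) X-Api-Key header."""
    if "X-Api-Key" not in request.headers:
        return Response(401, "missing API key")
    return None


def add_request_id(request: Request) -> Optional[Response]:
    """Add a header before forwarding; the fixed value is a placeholder."""
    request.headers.setdefault("X-Request-Id", "req-123")
    return None


def make_proxy(filters: List[Filter], upstream: Callable[[Request], Response]):
    """Build a handler that runs every filter, then forwards to the upstream."""
    def handle(request: Request) -> Response:
        for apply_filter in filters:
            early = apply_filter(request)
            if early is not None:
                return early
        return upstream(request)
    return handle


def orders_service(request: Request) -> Response:
    # Stand-in for the real service, which knows nothing about the proxy.
    return Response(200, f"orders saw {request.headers.get('X-Request-Id')}")


proxy = make_proxy([require_api_key, add_request_id], orders_service)
```

Calling `proxy(Request("GET", "/orders"))` is rejected with a 401 before the service is ever reached, while a request carrying the key is forwarded with the extra header attached. The service code is identical in both cases, which is the point of the pattern.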

In the Kubernetes world, you can use sidecar proxies to extend and enhance the main container, meaning all requests and responses to and from the service are routed via the proxy. There are several use cases for this. For example, when different services require different proxy setups, it is impossible or impractical to share a single proxy between them. The sidecar proxy design can also reduce the blast radius if one of the proxies is down.

This presents a new challenge. You can individually configure service proxies, but it’s more typical to manage them centrally. However, if the proxies are distributed, how can you configure all of them effectively? This is where orchestrating via a control plane, such as a service mesh or API management solution, comes in.

Service mesh

A service mesh is a dedicated infrastructure layer for managing service-to-service communication within a microservices architecture. The service mesh enables platform teams to add uniform reliability, observability and security features across all services running within a cluster. Handily, it does so without requiring any code changes.

When sidecar proxies are deployed next to applications to route and process all traffic from them, it raises the question of who configures the sidecar proxy to route a specific request to a different service when required. The service mesh control plane can handle this. It can also handle load balancing, circuit breaking, timeout settings, system-wide authentication, authorisation, and setting up blue/green and canary deployments. The control plane takes a set of isolated, stateless sidecar proxies and turns them into a distributed system.
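To make the split between control plane and data plane a little more concrete, here is a small Python sketch, not modelled on any particular mesh. The `ControlPlane` and `SidecarProxy` classes, the `payments` destination and the version weights are all assumptions made for illustration. The control plane holds the routing configuration and pushes it to every registered proxy; each proxy then makes weighted routing decisions locally, which is one simple way to express a canary rollout.

```python
import random
from dataclasses import dataclass, field


@dataclass
class SidecarProxy:
    service: str
    # destination name -> {version: weight}; filled in by the control plane
    routes: dict = field(default_factory=dict)

    def apply_config(self, routes: dict) -> None:
        self.routes = dict(routes)

    def pick_version(self, destination: str, rng: random.Random) -> str:
        """Choose a destination version in proportion to its configured weight."""
        weights = self.routes[destination]
        versions = list(weights)
        return rng.choices(versions, weights=[weights[v] for v in versions])[0]


class ControlPlane:
    def __init__(self):
        self.proxies = []
        self.routes = {}

    def register(self, proxy: SidecarProxy) -> None:
        self.proxies.append(proxy)
        proxy.apply_config(self.routes)

    def set_route(self, destination: str, weights: dict) -> None:
        """Update one route and push the full configuration to every proxy."""
        self.routes[destination] = weights
        for proxy in self.proxies:
            proxy.apply_config(self.routes)
```

With two proxies registered, a single `set_route("payments", {"v1": 90, "v2": 10})` call updates both, and roughly one request in ten from either proxy is then sent to `v2`.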

Different service mesh products have different philosophies and offer different sets of control plane features. Istio and Linkerd are two examples of this. They enable you to use the control plane layer to provide overall configuration, certificate authority and service discovery for the service proxies.

The main features of service meshes include dynamic service discovery, mTLS and traffic management, separating these concerns from the business logic. With the service mesh providing a single plane of control, you can use it to define security policies, free developers from networking concerns and gain visibility into what’s happening in your cluster.

Service mesh benefits

In short, you can use a service mesh to automate DevSecOps best practices for observability, security and reliability.

On the observability front, you can track metrics such as the number of requests per second a service has processed and its success and failure rates. With automated log aggregation and telemetry collection, this observability functionality provides plenty of scope for faster, more efficient monitoring and troubleshooting.
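The kind of bookkeeping a proxy does to produce these numbers can be sketched in a few lines of Python. The `ServiceMetrics` class is hypothetical, and treating any status below 500 as a success is an assumption for the example; real meshes expose far richer telemetry.

```python
from collections import Counter


class ServiceMetrics:
    """Per-service counters a proxy could update on every response."""

    def __init__(self):
        self.counts = Counter()

    def record(self, status: int) -> None:
        # Assumption: only 5xx responses count as failures.
        self.counts["total"] += 1
        self.counts["success" if status < 500 else "failure"] += 1

    def success_rate(self) -> float:
        total = self.counts["total"]
        return self.counts["success"] / total if total else 1.0

    def requests_per_second(self, window_seconds: float) -> float:
        return self.counts["total"] / window_seconds
```

Because the proxy sees every request, these counters are collected without the application emitting any metrics itself.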

From a security perspective, a service mesh automates mTLS for all service-to-service communications and provides certificate rotation to help prevent man-in-the-middle attacks and work towards zero-trust security. Fine-grained traffic governance is possible with different service-level authentication methods. This lets you create role-based access control for each service and restrict communication.

In reliability terms, service meshes automatically handle load balancing, retries, timeouts and circuit breaking, making applications more resilient to service failures and improving overall user experiences.
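Circuit breaking is perhaps the least familiar of these, so here is a minimal Python sketch of the pattern. It is not how any specific mesh implements it; the `CircuitBreaker` class, its thresholds and the injectable `clock` parameter are assumptions made for the example.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker refuses a call without trying the upstream."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: allow one trial call (half-open state).
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result
```

After `failure_threshold` consecutive failures, further calls fail immediately instead of piling more load onto a struggling service. Once `reset_after` seconds have passed, a single trial call is let through, and a success closes the circuit again.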

That’s not to say they are without issues. Even if the data plane proxy is really thin, using the sidecar deployment model still means adding an extra hop to the request and needing more resources to run. It’s also tricky to calculate how much memory and CPU sidecars should get, which can result in costly, resource-intensive over-provisioning. The fact that sidecar containers were not first-class citizens before Kubernetes 1.28 can also cause problems.

API gateway

An API gateway is a tool that aggregates application APIs, making them all available in one place. It allows organisations to move key functions or cross-cutting concerns, such as authentication and authorisation or limiting the number of requests between applications, to a centrally managed data plane. The API gateway is a common interface for (often external) API consumers.
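Rate limiting is a good example of such a centrally managed concern. One common way to implement it is a token bucket, sketched below in Python. The `TokenBucket` class and its parameters are hypothetical, and a real gateway would typically keep one bucket per consumer or API key, often in shared storage.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to the time elapsed, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A gateway would call `allow()` once per incoming request and answer with HTTP 429 when it returns `False`, so the upstream services never see the excess traffic.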

API gateways have plenty to offer when it comes to security. The edge of a system is where users first interact with your applications and typically where hackers first encounter your systems. While most enterprises will have multiple security layers at the edge of their stack – such as a CDN, WAF and dedicated DMZ – the API gateway may be the first line of defence for smaller organisations.

For some, it will be the only line of defence. A gateway’s TLS termination, IP allow or deny lists, TCP routing and even basic WAF capabilities or integrations can all be very handy here. (You could, of course, use a service proxy for this use case too.)

A distinguishing security feature of API gateways is the ability to interoperate with external authorisation servers using standards-based patterns such as OAuth2/OpenID Connect. The gateway can read an authorisation header, introspect tokens, validate JWTs, extract claims such as scopes, map and validate those claims against internal security policies and then route traffic whilst propagating identity to the appropriate upstream.
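The sequence described above can be sketched in Python using only the standard library. This is a simplified illustration under several assumptions: it uses an HS256 shared secret (production setups more commonly verify asymmetric signatures against keys published by the authorisation server), it expects the conventional space-delimited `scope` claim, and the `authorise` and `sign_hs256` functions and the `X-User-Id` header are names invented for the example.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Mint a token for demonstration; normally the authorisation server does this."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def authorise(auth_header: str, secret: bytes, required_scope: str, now=None) -> dict:
    """Validate a bearer JWT and return the identity headers to send upstream."""
    scheme, _, token = auth_header.partition(" ")
    if scheme != "Bearer" or token.count(".") != 2:
        raise PermissionError("malformed Authorization header")
    header_b64, payload_b64, sig_b64 = token.split(".")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise PermissionError("unexpected signing algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise PermissionError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    if claims.get("exp", 0) <= (time.time() if now is None else now):
        raise PermissionError("token expired")
    if required_scope not in claims.get("scope", "").split():
        raise PermissionError("missing required scope")
    # Propagate identity to the upstream via a (hypothetical) header.
    return {"X-User-Id": str(claims.get("sub", ""))}
```

The upstream service receives only the propagated identity header and never has to parse or verify the token itself.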

Note that each API style has its own security requirements, and not all gateways will have native support for your required API style:

For REST, this might include the need to protect API routes by method/path. More advanced gateways will also be able to validate requests and responses by an OpenAPI contract.
For GraphQL, you will need to enable or disable different queries, mutations or subscriptions and even apply granular field-based permissions.
For gRPC-based services, an API gateway would need to be able to handle request authorisation for service and service methods.
For asynchronous APIs, you might need to apply rate limits based on the number of events or data and enable authorisation for which topics can be subscribed.

Endpoint optimisation

A common requirement for API gateways is the need to optimise endpoints. Most transactions will involve calls to multiple microservices; sometimes, they must happen in a predetermined sequence. API gateways can simplify chained or batch calls to expose a single endpoint and return a single, aggregated response to the API client. This results in improved performance – such as reduced latency – and makes it really simple for developers to consume your services.
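As a sketch of what such an aggregated endpoint might look like, the Python below fans out to three upstreams in parallel and merges the results. The `fetch_*` functions are stubs standing in for real network calls, and the `account_overview` endpoint and its response shape are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


# Stubs standing in for calls to three separate microservices.
def fetch_user(user_id):
    return {"id": user_id, "name": "Ada"}


def fetch_orders(user_id):
    return [{"order_id": 1, "total": 42}]


def fetch_recommendations(user_id):
    return ["keyboard"]


def account_overview(user_id):
    """Single gateway endpoint: call the upstreams concurrently, return one response."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        user = pool.submit(fetch_user, user_id)
        orders = pool.submit(fetch_orders, user_id)
        recommendations = pool.submit(fetch_recommendations, user_id)
        return {
            "user": user.result(),
            "orders": orders.result(),
            "recommendations": recommendations.result(),
        }
```

The client makes one round trip instead of three, and because the upstream calls run concurrently, the overall latency is closer to that of the slowest call than to the sum of all three.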

The API gateway can also provide a public interface for a client, different from the disparate set of polyglot API upstreams. This is known as protocol transformation or mediation. It allows you to keep a consistent public-facing API but swap out the implementation of the underlying microservices without ill effect on your consumers.

Perhaps some of your APIs are legacy services you don’t have the desire or resources to modify to fit your current business requirements. An API gateway can help here, too. Over time, microservice applications can end up exposing APIs in all kinds of different formats, whether intentionally or not. If your API consumers want to consume GraphQL but your underlying microservices use other formats or protocols, such as JSON, protobuf, Apache Avro or AMQP, the API gateway is the tool to perform that transformation.

An API gateway is also ideal when breaking up a monolith. It allows you to ensure continued access to clients through the gateway whilst progressively splitting out the monolith over time.
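The routing rule behind this gradual migration can be very small. In the Python sketch below, the path prefixes and upstream names are hypothetical; the idea is simply that anything already split out goes to its new service and everything else still goes to the monolith.

```python
# Paths that have already been split out of the monolith (hypothetical names).
MIGRATED_PREFIXES = {
    "/payments": "payments-service",
    "/catalogue": "catalogue-service",
}


def choose_upstream(path: str) -> str:
    """Send migrated paths to their new service; everything else to the monolith."""
    for prefix, upstream in MIGRATED_PREFIXES.items():
        if path == prefix or path.startswith(prefix + "/"):
            return upstream
    return "legacy-monolith"
```

Moving another piece of functionality out of the monolith then becomes a one-line change to the gateway configuration, with no change visible to clients.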

Key API gateway features and functionalities include cross-origin resource sharing, caching, service discovery, mocking, circuit breaking and more. And when you couple that gateway with API management, things get really interesting.

API management

API management isn’t just a control plane for API gateways but – perhaps more importantly – a business enabler. You can use it to deliver significant business value.

Do you remember, back in the day, when developers had to integrate with different services via RabbitMQ or a REST service, with documentation provided via PDF or a wiki? It meant reading documentation in various formats and translating it into an API client.

How things have changed. Modern API gateways can be deployed locally on your laptop or within a local Docker container. You can load an OpenAPI specification into the gateway and have it automatically mock the responses defined in that specification. What’s more, the gateway can automatically validate payloads, so you can write tests that call the mocked services on the gateway.
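To show roughly what mocking from a specification involves, here is a simplified Python sketch. The specification fragment, the `/pets/{petId}` path and the `mock` function are invented for the example; it only handles the `example` field of a JSON `200` response and assumes path templates contain no regular-expression metacharacters. Real gateways cover far more of the specification.

```python
import re

# A tiny OpenAPI 3.0 fragment, written as a Python dict for brevity.
OPENAPI_SPEC = {
    "openapi": "3.0.3",
    "paths": {
        "/pets/{petId}": {
            "get": {
                "responses": {
                    "200": {
                        "content": {
                            "application/json": {
                                "example": {"id": 1, "name": "Rex"}
                            }
                        }
                    }
                }
            }
        }
    },
}


def find_template(spec: dict, path: str):
    """Return the path template matching a concrete path, or None."""
    for template in spec["paths"]:
        pattern = "^" + re.sub(r"\{[^/}]+\}", "[^/]+", template) + "$"
        if re.match(pattern, path):
            return template
    return None


def mock(spec: dict, method: str, path: str):
    """Answer a request with the example from the spec instead of a real upstream."""
    template = find_template(spec, path)
    if template is None or method.lower() not in spec["paths"][template]:
        return 404, {"error": "no such operation"}
    operation = spec["paths"][template][method.lower()]
    media = operation["responses"]["200"]["content"]["application/json"]
    return 200, media["example"]
```

A call such as `mock(OPENAPI_SPEC, "GET", "/pets/42")` returns the example pet, so client code and tests can be written before the real service exists.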

For more complex or dynamic mocks, you can create virtual functions to simulate the real world better, shaving hours off development time and assisting with automated testing.

With API management in place, you can underpin business value through discoverability (usually courtesy of some kind of developer portal with a golden path to aid self-service onboarding), version management and in-depth analytics for superior visibility.

Do you need a service proxy, service mesh or API gateway?

So, which do you need: a service proxy, service mesh or API gateway? They are all components that aid network communication in a microservices architecture. However, they all serve different purposes and have distinct features that suit particular use cases:

A service proxy is a general-purpose proxy. Used alone, it commonly serves as a load balancer.
A service mesh is handy for platform engineers, DevOps, and site reliability engineers when they want to apply policies for security and governance purposes uniformly. It also enables them to get detailed insights into how their microservices are performing.
An API gateway speaks more to API developers and API product owners. APIs are crucial business assets that enable innovation and go far beyond ingress. Think of Open Banking, Stripe, GitHub, Twilio and HubSpot. In each case, APIs have been productised strategically to grow significant adoption and usage of a platform.

Of course, when deciding between service proxy, service mesh and API gateway, it’s not a matter of choosing one or the other; you can use them in combination. Ultimately, it comes down to the outcomes you’re trying to achieve. Do you need zero-trust security in your network? A service mesh can automate that for you!

Would you like to take an API-first approach in development? Then, an API gateway is here to help. Do you want both? Why not! In fact, why not explore how Tyk works with a service mesh right now?

Making decisions like this can be really tough – but the potential rewards are well worth it.
