Guest post originally published on the Magalix blog by Mohamed Ahmed

What Is OPA?

Open Policy Agent (OPA) is an open source project, started in 2016, that aims to unify policy enforcement across different technologies and systems. Today, OPA is used by major players in the tech industry. For example, Netflix uses OPA to control access to its internal API resources, and Chef uses it to provide IAM capabilities in its end-user products. Many other companies, such as Cloudflare and Pinterest, use OPA to enforce policies on their platforms (like Kubernetes clusters). Currently, OPA is part of CNCF as an incubating project.

What Does OPA Bring To The Table?

You may be wondering: How did OPA come about? What problems does it try to solve? Indeed, policy enforcement over APIs and microservices is as old as microservices themselves. Practically every production-grade application enforces access control, authorization, and policy of some kind.

To understand the role of OPA, consider the following use case: your company sells laptops through an online portal. Like other similar applications, the portal consists of a front page where clients see the latest offerings and perhaps some limited-time promotions. If customers want to buy something, they need to log in or create an account. Next, they pay with a credit card or another method. To encourage repeat visits, you invite them to sign up for your newsletter, which may contain special discounts. They may also opt to receive browser notifications as soon as new products are announced. A very typical online shopping app, right? Now, let’s depict that workflow in a diagram to visualize the process:

OPA workflow diagram

The diagram above shows how our system might look internally. We have a number of microservices that communicate with each other to serve our customers. Obviously, Bob, a customer, shouldn’t see any of the internal workings of the system. For example, he can’t view (or even know about) the S3 bucket where payments get archived, or which services the notification API can talk to.

But what about John? He’s one of our application developers, and he needs access to all the microservices to be able to troubleshoot and debug when issues occur. Or does he? What if he accidentally (or intentionally) made an API call to the database service to change a customer’s delivery address? Even worse, what if he had read permissions to the customers’ credit card numbers?

To address those risks, we place an authorization control on top of each of our microservices. The control checks whether or not the authenticated user has the required privileges to perform the requested operation. Such an authorization system may be an internal, home-grown process or an external one such as AWS IAM. That’s how a typical microservices application is built and secured. But look at the drawbacks of using several assorted authorization systems, especially as the application grows:

Let’s look at Kubernetes as an example. If every user were granted access to the entire cluster, lots of nasty things could happen, such as:

You can definitely use RBAC and Pod security policies to impose fine-grained control over the cluster. But again, this will only apply to the cluster. Kubernetes RBAC is of no use except in a Kubernetes cluster.

That’s where Open Policy Agent (OPA) comes into play. OPA was introduced to create a unified method of enforcing security policy across the whole stack.

How Does OPA Work?

Earlier, we explored the policy-enforcement strategies and what OPA tries to solve; that showed us the “what” part. Now, let’s take a look at the “how.”

Let’s say that you’re implementing the Payments service of our example application. This service is responsible for handling customer payments. It exposes an API through which it accepts payments from customers. It also allows a user to query which payments were made by a specific customer. So, to obtain an array containing the purchases made by Jane, who is one of the company’s customers, you send a GET request to the API with the path /payments/jane. You provide your credentials in the Authorization header and send the request. The response is a JSON array with the data you requested. However, since you don’t want just anyone with network access to the Payments API to see such sensitive data, you need to enforce an authorization policy. OPA addresses the issue in the following way:

  1. The Payments API queries OPA for a decision. It accompanies this query with some attributes like the HTTP method used in the request, the path, the user, and so on.
  2. OPA validates those attributes against data already provided to it.
  3. After validation, OPA sends a decision to the requesting API with either allow or deny.
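The three steps above can be sketched from the point of view of the Payments service. This is a minimal Python sketch rather than a definitive integration: it assumes OPA is running locally on its default port (8181), that the policy publishes its decision at data.play.allow, and that the helper names (build_query, post_to_opa, authorize) are our own inventions for illustration.

```python
import json
import urllib.request
from typing import Callable

# Assumption: a local OPA on its default port, with the decision exposed at
# data.play.allow. Adjust the URL to match your own package path.
OPA_URL = "http://localhost:8181/v1/data/play/allow"


def build_query(method: str, path: str, user: str) -> dict:
    """Step 1: wrap the request attributes in the {"input": ...} envelope."""
    return {
        "input": {
            "method": method,
            # "/payments/jane" becomes ["payments", "jane"]
            "path": [part for part in path.split("/") if part],
            "user": user,
        }
    }


def post_to_opa(query: dict) -> dict:
    """Steps 2 and 3: send the query to OPA and return its JSON reply."""
    request = urllib.request.Request(
        OPA_URL,
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def authorize(method: str, path: str, user: str,
              ask_opa: Callable[[dict], dict] = post_to_opa) -> bool:
    """Enforcement stays in the service: only an explicit true is an allow."""
    reply = ask_opa(build_query(method, path, user))
    return reply.get("result") is True
```

Passing the transport in as ask_opa keeps the decision call separate from the enforcement code, so the service can be exercised with a stubbed reply when no OPA instance is available.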

The important thing to notice here is that OPA decouples our policy decision from policy enforcement. The OPA workflow can be depicted in the following diagram:

OPA workflow diagram

OPA is a general-purpose, domain-agnostic policy enforcement tool. It can be integrated with APIs, the Linux SSH daemon, an object store like Ceph, and so on. OPA’s designers purposefully avoided basing it on any other project. Accordingly, the policy query and decision do not follow a specific format. That is, you can use any valid JSON data as request attributes as long as it provides the required data. Similarly, the policy decision coming from OPA can also be any valid JSON data. You choose what goes in and what comes out. For example, you can opt to have OPA return a JSON true or false, a number, a string, or even a complex data object.
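Because the shape of the decision is up to you, the calling service has to know which shape to expect. The following Python sketch shows one way a caller might normalize two possible shapes. The structured form with "allow" and "reason" keys is purely an assumption made for this example; OPA does not prescribe it.

```python
from typing import Any, Tuple


def to_verdict(decision: Any) -> Tuple[bool, str]:
    """Normalize a policy decision into (allowed, reason)."""
    if isinstance(decision, bool):
        # Simplest shape: the policy returns a bare true or false.
        return decision, ""
    if isinstance(decision, dict):
        # Hypothetical richer shape: {"allow": <bool>, "reason": <string>}.
        allowed = decision.get("allow") is True
        return allowed, str(decision.get("reason", ""))
    # Numbers, strings, null, or a missing decision: refuse rather than guess.
    return False, "unrecognized decision"


print(to_verdict(True))
print(to_verdict({"allow": False, "reason": "not the account owner"}))
```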

OPA Internals

To fully understand OPA and start implementing it in your own projects, you must familiarize yourself with its features and components. Let’s start with how you define your policies.

Policy Language: Rego

Rego is a high-level declarative language that was built specifically for OPA. It makes it very easy to define policies and address questions like: is Bob allowed to perform a GET request on /api/v1/products? Which records is he actually allowed to view?

Deployment

When it comes to deploying OPA, you have more than one option depending on your specific scenario:


How To Manage And Control OPA?

To further reduce latency, the designers decided that OPA should keep all the policy data in memory. This ensures that OPA does not have to query another service for data while making a decision. To manage OPA, you have a set of APIs that serve different purposes:

Your First OPA Policy

By now you should have a pretty clear picture of why OPA came into existence, the problems it tries to solve, and how it is designed and managed. It’s time to test the waters and see what it’s like to create a policy in the Rego language. The first step is to define your policy in plain English. For example:

“Customers should be able to view their own payments. Financial department staff should be able to view any customer payment.”

The next step is to convert the policy to Rego code. We can use the Rego playground for this. In the main panel, clear the code that is already there and add the following:

package play

# Customers should be able to view their own payments
allow = true {
  input.method = "GET"
  input.path = ["payments", customer_id]
  input.user = customer_id
}

Let’s review this snippet line by line:

  1. Any line that starts with the hash sign (#) is a comment. It’s always good practice to describe what your policy is supposed to do in a coherent, human-readable comment.
  2. allow = true means that allow takes the value true when every expression in the rule body holds.
  3. The input method must be GET. A request that uses any other HTTP method (POST, PUT, etc.) does not satisfy this rule.
  4. The path must be /payments/customer_id, supplied as the array ["payments", customer_id]. Notice that customer_id is not quoted, which means that it’s a variable; OPA binds it to the second element of the path at evaluation time.
  5. The user must be equal to that same customer_id.

If we were to translate this code back to plain English, it’d look something like:

“Allow the request if the method it uses is GET, the path is /payments/customer_id, and the user is the same customer_id.” This effectively allows a customer to view her own payment data.

The Rego playground also allows you to evaluate your code and make sure that the policy will work as expected. In the INPUT panel, we can fake a legitimate request by adding the following code:

{
  "method": "GET",
  "path": ["payments","bob"],
  "user": "bob"
}

Notice that the INPUT is arbitrary JSON. There are no specific rules to follow when supplying the request. Now, let’s see how OPA replies to this decision request by pressing the Evaluate button. The OUTPUT panel should display something like the following:

{
  "allow": true
}

Below is a screenshot of the playground after performing the above steps:

Rego playground screenshot

Now, let’s change the user in the request to alice, which means that a customer is trying to view the payments of another customer. If you press Evaluate, you will notice that the output is an empty JSON object {}. The reason is that the allow rule is undefined when its body does not match, so OPA has no value to return. To change this behavior, add the following statement before the body of the policy:

default allow = false

So, the whole policy should look like this:

package play

# Customers should be able to view their own payments
default allow = false
allow = true {
  input.method = "GET"
  input.path = ["payments", customer_id]
  input.user = customer_id
}

Now, if you press Evaluate you’ll see the expected output:

{
  "allow": false
}
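To make the difference between “undefined” and “false” concrete, here is a small Python model of the rule. It only imitates the behavior described above and is not how OPA is implemented; UNDEFINED, customer_rule, and evaluate are names made up for this sketch.

```python
UNDEFINED = object()  # stands in for "the rule produced no value"


def customer_rule(request: dict):
    path = request.get("path")
    # Mirrors input.path = ["payments", customer_id]: the shape must match,
    # and customer_id is bound to the second element.
    if not (isinstance(path, list) and len(path) == 2 and path[0] == "payments"):
        return UNDEFINED
    customer_id = path[1]
    if request.get("method") == "GET" and request.get("user") == customer_id:
        return True
    return UNDEFINED


def evaluate(request: dict, default=UNDEFINED) -> dict:
    value = customer_rule(request)
    if value is UNDEFINED:
        value = default  # plays the role of "default allow = false"
    return {} if value is UNDEFINED else {"allow": value}


alice_reads_bob = {"method": "GET", "path": ["payments", "bob"], "user": "alice"}
print(evaluate(alice_reads_bob))                 # {}
print(evaluate(alice_reads_bob, default=False))  # {'allow': False}
```

Without a default, a non-matching request yields no allow key at all, which is why the playground showed an empty object.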

Notice that the playground also lets you select parts of the policy and evaluate them independently from the rest of the policy. This can be super useful when you have a complex policy that evaluates to false when it shouldn’t. In that case, you can select portions of the policy and see exactly where the flaw occurs.

Okay, now that we’ve executed the first part of our policy, let’s move on to the second part: the financial department staff should be able to view any customer payment.

Add the following lines after the policy that we defined earlier:

# Financial department staff can view any customer payments

allow = true {
  input.method = "GET"
  input.path = ["payments", customer_id]
  finance[input.user]
}

finance = {"john","mary","peter","vivian"}

Most of this policy is similar to the previous one, except for the last line of the rule body. Instead of evaluating whether the user ID is the same as the customer ID, we evaluate whether the user is a member of the finance set. Rego has a lot of built-in constructs that allow you to do many helpful things, including lookups. Finally, we define the finance set and add the usernames of the staff who work within that group. In a real-world scenario, this data would be passed as part of the INPUT request or as a token.

Now, let’s test the policy by setting the user and the customer to the same name (for example, bob). The policy should return true. Change the user to john (who is part of the finance department) and test the policy. Again, it should return true. Finally, change the user to any name that is not in the finance department (let’s say, jane), and the policy should return false.
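The three checks just described can also be written down as a small table of cases. The sketch below mirrors the combined policy in plain Python so that the expected outcomes are easy to verify; it is an illustration of the logic, not a replacement for evaluating the Rego policy, and the function names are hypothetical.

```python
FINANCE = {"john", "mary", "peter", "vivian"}  # same names as the finance set


def allow(request: dict) -> bool:
    path = request.get("path", [])
    if request.get("method") != "GET" or len(path) != 2 or path[0] != "payments":
        return False  # neither rule matches, so the default applies
    customer_id = path[1]
    user = request.get("user")
    # First rule: customers see their own payments.
    # Second rule: finance staff see any customer's payments.
    return user == customer_id or user in FINANCE


def request_for(user: str, customer: str = "bob") -> dict:
    return {"method": "GET", "path": ["payments", customer], "user": user}


cases = {"bob": True, "john": True, "jane": False}
results = {user: allow(request_for(user)) for user in cases}
print(results)  # {'bob': True, 'john': True, 'jane': False}
```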

You can read more about the Rego language and what you can do with it by referring to the official documentation.

Integrating OPA With Other Systems

As mentioned before, OPA can be integrated with many of today’s platforms. Let’s take a look at a few examples of what OPA can do for you:

Kubernetes:

API Authorization:

Linux PAM:

Many other products can also be integrated with OPA, for example, Kafka, Elasticsearch, SQLite, and Ceph, to name a few.

TL;DR

To fast-track your adoption of policy as code with OPA, check out Magalix KubeAdvisor and its simple markdown interface for Open Policy Agent, and try a 14-day free trial.