Guest post originally published on Elastisys’s blog by Lars Larsson, Senior Cloud Architect, and DevOps Expert Engineer at Elastisys

CTOs with applications targeting people in the EU know that, in this post-“Schrems II” world, GDPR makes using US cloud providers legally difficult. In a nutshell, GDPR states that the personally identifiable information of people in the EU must be protected against disclosure, and there are laws in the US that require precisely such disclosure (FISA with its Section 702, and the CLOUD Act). Since US-based cloud providers are bound by these laws, there is a legal conundrum, for sure. But as technically minded people, we naturally search for technical solutions. And the question I hear all the time from CTOs is whether encrypting the data could offer a quick solution. Well, can it?

What is encryption and why could it help?

Encryption uses mathematical algorithms that take data and an encryption key and produce encrypted data, which is very, very hard to make sense of in its encrypted form. Reading it requires access to both the encrypted data and the secret encryption key. Technically speaking, it is not impossible without the secret key, but it would require a truly infeasible amount of processing power.

It’s like all kinds of security: you want to make it sufficiently hard to steal your valuables so that people won’t bother doing it. If my bike lock looks harder to pick than the lock on the next bike, a bike thief will steal that one before mine. Similarly, if my data is encrypted such that it statistically takes a fleet of computers thousands of years to decrypt using brute force, I may as well consider it safe forever. Given that, it is only natural that CTOs ask the “but what if we just use encryption” question in relation to the GDPR.
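To put rough numbers on the brute-force argument, here is a minimal back-of-the-envelope sketch. The guess rate and the size of the fleet are assumptions picked purely for illustration, not measurements of any real hardware, and the function name is hypothetical.

```python
# Rough estimate of how long an exhaustive key search would take.
# All rates and fleet sizes used here are illustrative assumptions.

SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def years_to_brute_force(key_bits: int, guesses_per_second: float, machines: int) -> float:
    """Expected years to find a key by trying, on average, half the keyspace."""
    keyspace = 2 ** key_bits
    expected_guesses = keyspace / 2
    total_rate = guesses_per_second * machines
    return expected_guesses / total_rate / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Assumed fleet: one million machines, each trying 10^12 keys per second.
    for bits in (56, 128, 256):
        years = years_to_brute_force(bits, guesses_per_second=1e12, machines=1_000_000)
        print(f"{bits}-bit key: about {years:.3g} years")
```

Under these assumptions a short key falls almost immediately, while every extra bit of key length doubles the work, which is why the choice of algorithm and parameters matters so much.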

Yes, encryption is sufficient for backups, but…

The European Data Protection Board (EDPB) recently issued a recommendation that states that encryption is actually a sufficient protection for data that is protected by GDPR. Case closed, then, right? Not quite. It is deemed sufficient for storing backups of data, as long as you have used state-of-the-art algorithms and parameters (yielding difficult enough encryption), and store only the encrypted data at, for instance, a US cloud provider. Not also the key, of course. So encrypted backups? Fine! Read “Use Case 3” on page 32 of the PDF for more details.

But storing encrypted data for backup purposes is just a tiny part of what the cloud enables. What most CTOs really care about is whether we can use the great services that e.g. AWS offers to do simple processing of data on a pay-as-you-go model.

…encryption is not sufficient for processing data

I hate having to bring bad news, but here we go.

For processing data using cloud services, you’re looking at “Use case 6: Transfer to cloud services providers or other processors which require access to data in the clear” on page 34. And that one states that there are no technical measures, neither encryption nor pseudonymization (replacing all personally identifiable information such that you can still see that some data belongs to person “X”, but not who “X” is), that can offer sufficient protection.
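For readers who have not seen pseudonymization in practice, here is a minimal sketch of one common way to do it: replacing an identifier with a keyed hash (HMAC). The function name, the key, and the sample records are all hypothetical. Keep in mind that, per the use case described above, this does not count as sufficient protection when the provider needs the data in the clear.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Map an identifier to a stable pseudonym using HMAC-SHA256.

    The same identifier always yields the same pseudonym, so records can
    still be grouped per person, but the identifier itself is not shown.
    """
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # shortened for readability only


if __name__ == "__main__":
    key = b"hypothetical-key-kept-by-the-data-exporter"  # assumption: never shared
    records = [
        {"email": "anna@example.com", "page": "/pricing"},
        {"email": "bjorn@example.com", "page": "/docs"},
        {"email": "anna@example.com", "page": "/signup"},
    ]
    for record in records:
        print({"person": pseudonymize(record["email"], key), "page": record["page"]})
```

The two records for the same address end up with the same "person" value, which is exactly the "data belongs to person X, but not who X is" property.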

So no server logs sent to CloudWatch Logs, no images processed by Rekognition, no personally identifiable data stored in RDS. No to all of them. No running your Kubernetes clusters in EKS and deploying your applications to them. Because all of those services need to process the personally identifiable information in the clear (unencrypted form).

Final verdict: encryption only realistically works for backups

So the recommendation is crystal clear: if data has to be processed in its unencrypted form, cloud providers based in a “third country” (outside of the European Economic Area) cannot be used. If the data is just transferred and stored in its encrypted form, only for backup purposes, then that’s considered OK, given strong and sound encryption.

However, do note that the EDPB is forward-thinking enough to say that if it were possible to process the data in encrypted form, such that it stays encrypted throughout processing, that would be fine. And while that may sound like a futuristic dream, it’s actually already a thing: fully homomorphic encryption. It just happens to be significantly slower than processing data in the clear, and is not supported by any mainstream cloud service you have heard of, so it’s mostly academic at this point. I wouldn’t hold my breath until these things come and save us.
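To make "processing data while it stays encrypted" a bit more concrete, here is a toy sketch of the Paillier scheme. Note the assumptions: Paillier is only additively homomorphic (it can add encrypted numbers, nothing more), so it is a much simpler relative of fully homomorphic encryption, and the tiny primes used here are for illustration only and offer no security whatsoever. All function names are hypothetical.

```python
import math
import secrets


def generate_keys(p: int, q: int):
    """Build a toy Paillier key pair from two primes (assumed prime, far too small)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse, valid because the generator is n + 1
    return n, (lam, mu)


def encrypt(n: int, message: int) -> int:
    """Encrypt an integer 0 <= message < n using only the public value n."""
    n_squared = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, message, n_squared) * pow(r, n, n_squared)) % n_squared


def decrypt(n: int, private_key, ciphertext: int) -> int:
    lam, mu = private_key
    x = pow(ciphertext, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Add two plaintexts by multiplying their ciphertexts; no key needed."""
    return (c1 * c2) % (n * n)


if __name__ == "__main__":
    n, private_key = generate_keys(10007, 10009)
    # The "cloud" only ever sees the two ciphertexts and the public n.
    encrypted_sum = add_encrypted(n, encrypt(n, 20), encrypt(n, 22))
    print(decrypt(n, private_key, encrypted_sum))
```

The party doing the addition never holds the private key, which is the property that makes this family of schemes interesting here. Supporting arbitrary computation, as fully homomorphic schemes do, is where the heavy performance cost comes in.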

Where to go from here

What to do now, then? Well, first of all, you figure out exactly what data can or cannot be processed by US cloud services. You need that data inventory. Personally identifiable information is key here. For the parts that can be processed in the clear by US cloud services, you can confidently keep doing what you’ve probably already been doing. And for the parts that are covered by the GDPR, you make sure to use EEA-based cloud providers instead. This probably means looking less toward services offered by cloud providers themselves (clouds in the EEA are not as fully featured as their US counterparts yet), and more toward what you can get using open source and cloud native tools.
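A data inventory does not have to start out as anything fancy. As a minimal sketch, assuming you can answer two yes/no questions per dataset, the decision described in this post could be written down like this. The dataset names and the wording of the outcomes are hypothetical, and this is an illustration of the reasoning, not legal advice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    contains_personal_data: bool
    needs_cleartext_processing: bool


def allowed_placement(dataset: Dataset) -> str:
    """Apply the rules of thumb from this post to a single dataset."""
    if not dataset.contains_personal_data:
        return "any provider"
    if dataset.needs_cleartext_processing:
        return "EEA-based provider only"
    return "any provider, if strongly encrypted and the key is kept away from it"


if __name__ == "__main__":
    inventory = [
        Dataset("build-artifacts", contains_personal_data=False, needs_cleartext_processing=True),
        Dataset("customer-database", contains_personal_data=True, needs_cleartext_processing=True),
        Dataset("customer-database-backups", contains_personal_data=True, needs_cleartext_processing=False),
    ]
    for dataset in inventory:
        print(f"{dataset.name}: {allowed_placement(dataset)}")
```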

It sounds like a lot of work. I know that. But all this is not some optional feature to implement some rainy day. It’s actually the law.

And for the CTOs from the US reading this, you are not off the hook. The GDPR’s reach does not stop at the EU’s borders: under Article 3, it also applies to companies outside the EU that offer goods or services to people in the EU, or that monitor their behavior there. So you need to handle this, too. A company in LA with customers in Sweden has to comply just as one based in Sweden does.

If you need a technologically focused therapy session after reading all this, send me an email at lars.larsson@elastisys.com and let’s talk.