In 2016, Deis (now part of Microsoft) platform architect Matt Butcher was looking for a way to explain Kubernetes to technical and non-technical people alike. Inspired by his daughter’s prolific stuffed animal collection, he came up with the idea of “The Illustrated Children’s Guide to Kubernetes.” Thus Phippy, the yellow giraffe and PHP application, was born, along with her friends.
Today we’re excited to welcome Phippy and the cast of snuggly, cloud native characters into CNCF. As Kubernetes continues to see unprecedented momentum, her story offers developers an easy way to explain their work to parents, friends, and children.
Today, live from the keynote stage at KubeCon + CloudNativeCon North America, Matt and co-author Karen Chu announced Microsoft’s donation and presented the official sequel to “The Illustrated Children’s Guide to Kubernetes” in their live reading of “Phippy Goes to the Zoo: A Kubernetes Story” – the tale of Phippy and her niece as they take an educational trip to the Kubernetes Zoo.
As part of Microsoft’s donation of both books and the characters Phippy, Goldie, Captain Kube, and Zee, CNCF has licensed all of this material under the Creative Commons Attribution License (CC-BY), which means that you can remix, transform, and build upon the material for any purpose, even commercially. If you use the characters, please include the text “phippy.io” to provide attribution (and online, please include a link to https://phippy.io). The characters were created by Matt Butcher, Karen Chu, and Bailey Beougher. Goldie is based on the Go Gopher, created by Renee French, which is also licensed under CC-BY. Images of the characters are available in the CNCF artwork repo in svg, png, and ai formats and in color, black, and white.
Now that Phippy and her cloud native friends have made CNCF their home, make sure to keep an eye out for the fun adventures the characters will find themselves on as the Kubernetes global community continues to grow!
etcd is a distributed key-value store that provides a reliable way to store data across a cluster of machines with best-in-class stability, reliability, scalability, and performance. The project – frequently teamed with applications such as Kubernetes, M3, Vitess, and Doorman – handles leader elections during network partitions and will tolerate machine failure, including the leader.
“etcd acts as a source of truth for systems like Kubernetes,” said Brian Grant, TOC representative and project sponsor, Principal Engineer at Google, and Kubernetes SIG Architecture Co-Chair and Steering Committee member. “As a critical component of every cluster, having a reliable way to automate its configuration and management is essential. etcd offers the necessary coordination mechanisms for cloud native distributed systems, and is cloud native itself.”
All Kubernetes clusters use etcd as their primary data store. As such, it handles storing and replicating data for Kubernetes cluster state and uses the Raft consensus algorithm to recover from hardware failure and network partitions. In addition to Kubernetes, Cloud Foundry also uses etcd as its distributed key-value store. This means etcd is used in production by companies such as Ancestry, ING, Pearson, Pinterest, The New York Times, Nordstrom, and many more.
“Alibaba uses etcd for several critical infrastructure systems, given its superior capabilities in providing high availability and data reliability,” said Xiang Li, senior staff engineer, Alibaba. “As a maintainer of etcd we see the next phase for etcd to focus on usability and performance. Alibaba looks forward to continuing co-leading the development of etcd and making etcd easier to use and more performant.”
“AWS is proud to have dedicated maintainers of etcd on our team to help ensure the bright future ahead for etcd. We look forward to continuing work alongside the community to continue the project’s stability,” said Deepak Singh, Director of Container Services at AWS.
“The Certificate Transparency team at Google works on both implementations and standards to fundamentally improve the security of internet encryption. The Open Source Trillian project is used by many organizations as part of that effort to detect the mis-issuance of trusted TLS certificates on the open internet,” says Al Cutter, the team lead. “And etcd continues to play a role in the project by safely storing API quota data to protect Trillian instances from abusive requests, and reliably coordinating critical operations.”
Written in Go, etcd has unrivaled cross-platform support, small binaries, and a thriving contributor community. It also integrates with existing cloud native tooling like Prometheus monitoring, which can track important metrics like latency from the etcd leader and provide alerting and dashboards.
“Kubernetes and many other projects like Cloud Foundry depend on etcd for reliable data storage. We’re excited to have etcd join CNCF as an incubation project and look forward to cultivating its community by improving its technical documentation, governance and more,” said Chris Aniszczyk, COO of CNCF. “etcd is a fantastic addition to our community of projects.”
“When we introduced etcd during the early days of CoreOS, we wanted it to be this ubiquitously available component of a larger system. Part of the way you get ubiquity is to get everyone using it, and etcd hit critical mass with Kubernetes and has extended to many other projects and users since. As etcd goes into the CNCF, maintainers from Amazon, Alibaba, Google Cloud, and Red Hat all have nurtured the project as its user base has grown. In fact, etcd is deployed in every major cloud provider now and is a part of products put forward by all these companies and across the cloud native ecosystem,” said Brandon Philips, CTO of CoreOS at Red Hat. “Having a neutral third party stewarding the copyrights, DNS and other project infrastructure is the reasonable next step for the etcd project and users.”
Other common use cases of etcd include storing important application configuration like database connection details or feature flags as key-value pairs. These values can be watched, allowing the application to reconfigure itself when they change. Advanced uses take advantage of the consistency guarantees to implement database leader elections or do distributed locking across a cluster of workers.
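The watch-and-reconfigure pattern described above can be sketched with a toy in-memory store. This is not the etcd client API: ToyWatchStore, App, and the key names config/db_url and flags/new_checkout are all hypothetical, and the store runs in a single process with none of etcd's replication or consistency guarantees. It only illustrates how an application might react to watched keys.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional

WatchCallback = Callable[[str, str], None]


class ToyWatchStore:
    """In-memory stand-in for a watchable key-value store (not the etcd API)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._watchers: Dict[str, List[WatchCallback]] = defaultdict(list)
        self.revision = 0  # incremented on every write

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def watch(self, key: str, callback: WatchCallback) -> None:
        """Register a callback that fires on every later write to `key`."""
        self._watchers[key].append(callback)

    def put(self, key: str, value: str) -> int:
        """Store a value, notify watchers of that key, return the new revision."""
        self.revision += 1
        self._data[key] = value
        for callback in self._watchers[key]:
            callback(key, value)
        return self.revision


class App:
    """Hypothetical application that reconfigures itself when watched keys change."""

    def __init__(self, store: ToyWatchStore) -> None:
        # Read the current configuration once at startup.
        self.db_url = store.get("config/db_url")
        self.new_checkout = store.get("flags/new_checkout") == "true"
        # Then subscribe to changes instead of polling.
        store.watch("config/db_url", self._on_change)
        store.watch("flags/new_checkout", self._on_change)

    def _on_change(self, key: str, value: str) -> None:
        if key == "config/db_url":
            self.db_url = value
        elif key == "flags/new_checkout":
            self.new_checkout = value == "true"
```

Writing a new value with store.put("flags/new_checkout", "true") updates the running App without a restart, which is the behavior the paragraph above describes.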
Main etcd Features:
Easily manages cluster coordination and state management across any distributed system
Written in Go and uses the Raft consensus algorithm to manage a highly available replicated log
Enables reliable distributed coordination through distributed locking, leader election, and write barriers
Handles leader elections during network partitions and will tolerate machine failure, including the leader
Includes a multi-version concurrency control data model
Empowers reliable key monitoring, which never silently drops events
Lease primitives for expiring keys
21,627 GitHub stars
9 maintainers representing 8 companies
As a CNCF hosted project, joining Incubated technologies like OpenTracing, Fluentd, Linkerd, gRPC, CoreDNS, containerd, rkt, CNI, Envoy, Jaeger, Notary, TUF, Vitess, NATS, Helm, Rook and Harbor, etcd is part of a neutral foundation aligned with its technical interests, as well as the larger Linux Foundation, which provides governance, marketing support, and community outreach.
Every CNCF project has an associated maturity level: sandbox, incubating, or graduated project. For more information on what qualifies a technology for each level, please visit the CNCF Graduation Criteria v.1.1.
Now that we’ve finally caught our breath after a fantastic two days at KubeCon + CloudNativeCon in Shanghai, let’s dive into some of the key highlights and news. The best part is we get to see so many of you again so soon at KubeCon + CloudNativeCon Seattle in December!
The sold-out event with more than 2,500 attendees (technologists, maintainers and end users of CNCF’s hosted projects) was full of great keynotes, presentations, discussions and deep dives on projects including Rook, Jaeger, Kubernetes, gRPC, containerd – and many more! Attendees had the opportunity to hear a slew of compelling talks from CNCF project maintainers, community members and end users including eBay, JD.com and Alibaba.
After hosting successful events in Europe and North America the past few years, it’s no wonder China was the next stop on the tour. Asia has seen a spike in cloud native adoption to the tune of 135% since March 2018. You can find more information about the tremendous growth of cloud usage in China from CNCF’s Survey.
The conference kicked off by welcoming 53 new global members and end users to the Foundation, including several from China such as R&D China Information Technology, Shanghai Qiniu Information Technologies, Beijing Yan Rong Technology and Yuan Ding Technology. The CNCF has 40 members in China, which represents a little more than 10% of the CNCF’s total membership.
During the conference there were many great sessions on AI as well as a Kubernetes AI panel discussion that included product managers, data scientists, engineers, and architects from Google, CaiCloud, eBay, and JD.com.
In further exciting China news, Harbor, the first project contributed by VMware to CNCF, successfully moved from sandbox into incubation. “Harbor is not only the first project donated by VMware to CNCF, but it is also the first Chinese program developed in the Chinese open source community donated to CNCF.”
– More details will be shared on an integration with Harbor in KubeCon Seattle
CNCF Diversity Scholarship
As always, the Foundation offered scholarships to members of traditionally underrepresented groups in the technology and/or open source communities. With $122,000 in diversity scholarship funds available, travel and/or conference registration was covered for 68 KubeCon attendees. Scholarship funding was provided by AWS, CNCF, Google, Helm Summit, Heptio and VMware.
Missed out on KubeCon + CloudNativeCon Shanghai? Don’t worry as you have more chances to attend. KubeCon + CloudNativeCon North America 2018 is taking place in Seattle, WA from December 10-13. The Conference is currently sold out, but if you’d like to be added to the waitlist, fill out this form. You will be notified as new spots become available.
Also, massive kudos to the translators at #KubeCon + #CloudNativeCon who have been ON POINT. The translator for the Service Mesh panel was capturing and relaying a boatload of technical details in a staggeringly fast amount of time.
Creating serverless applications is a multi-step process. One of the critical steps in this process is packaging the serverless functions you want to deploy into your FaaS (Function as a Service) platform of choice.
Before a function can be deployed it needs two types of dependencies: direct function dependencies and runtime dependencies. Let’s examine these two types.
Direct function dependencies – These are objects that are part of the function process itself and include:
The function source code or binary
Third party binaries and libraries the function uses
Application data files that the function directly requires
Runtime function dependencies – This is data related to the runtime aspects of your function. It is not directly required by the function but configures or references the external environment in which the function will run. For example:
Event message structure and routing setup
Runtime binaries such as OS-level libraries
External services such as databases, etc.
For a function service to run, all dependencies, direct and runtime, need to be packaged and uploaded to the serverless platform. Today, however, there is no common spec for packaging functions. Serverless package formats are vendor-specific and are highly dependent on the type of environment in which your functions are going to run. This means that, for the most part, your serverless applications are locked-down to a single provider, even if your function code itself abstracts provider-specific details.
This article explores the opportunity to create a spec for an open and extensible “Function Package” that enables the deployment of a serverless function binary, along with some extra metadata, across different FaaS vendors.
Function runtime environments
When looking at today’s common runtime FaaS providers, we see two mainstream approaches for running functions: container function and custom function runtimes. Let’s have a closer look at each type.
Container function runtimes
This method uses a container-based runtime, such as Kubernetes, and is the common runtime method for on-prem FaaS solutions. As its name suggests, it usually exposes container runtime semantics to users at one level or another. Here’s how it works.
A container entry-point is used as a function, together with an eventing layer, to invoke a specific container with the right set of arguments. The function is created with docker build or any other image generator to create an OCI container image as the final package format.
The main issue with container functions is that developing a function often requires understanding of the runtime environment; developers need to weave their function into the container. Sometimes this process is hidden by the framework, but it is still very common for the developer to need to write a Dockerfile for finer control over operating system-level services and image structure. This leads to high coupling of the function with the runtime environment.
Custom function runtimes
Custom function runtimes are commonly offered by cloud providers. They offer a “clean” model where functions are created by simply writing handler callbacks in your favorite programming language. In contrast to container-based functions, runtime details are left entirely to the cloud provider (even in cases where, behind-the-scenes, the runtime is based on containers).
Custom runtime environments use a language-centric model for functions. Software languages already have healthy packaging practices (such as Java jars, npm packages or Go modules). Still, no common packaging format exists for their equivalent functions. This creates cloud-vendor lock-in. Some efforts have been made to define a common deployment model (such as the AWS Serverless Application Model (SAM) and the open-source Serverless Framework), but these models either assume a custom binary package already exists or include the process of building it to each cloud provider’s standards.
The need for a Function Package
To be able to use functions in production, we need to use stable references to them to enable repeatability in deploying, upgrading and rolling back to a specific function version. This can be achieved with an immutable, versioned, sealed package that contains the function with all its direct dependencies.
Container images may meet these requirements because they offer a universal package format. However, there’s a side effect of “polluting” the function by tightly coupling it with details of the container runtime. Custom runtimes also exhibit coupling with their function packages. While they offer clean functions, they use proprietary package formats that mix the function dependencies with the runtime dependencies.
What we need is a clear separation: a clean function together with its dependencies in a native “Function Package” separate from external definitions of runtime-specific dependencies. This separation would allow us to take a function and reuse it across different serverless platforms, only adding external configuration as needed.
Getting a function to run
Let us reexamine for a moment the steps required to build and run a container-based function. We can look at this as a four step process:
Build a container image together with (direct + runtime) function dependencies and entry point definition
Push the image to a container registry so that we have a stable reference to it
Pull the image to the runtime and configure it according to runtime-specific dependencies
Accept function events at the defined entry point
This process works pretty well for container runtimes. We can try to formulate it into a more generalized view:
Build a function package that contains its direct dependencies (or references to them) and entry point definition
Upload the package to a package registry
Download the package (and its direct dependencies) to the runtime and configure runtime-specific dependencies
Accept function events at the defined entry point
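The four generalized steps above can be sketched end to end with a toy in-memory registry. Everything here is an assumption made for illustration: ToyRegistry, build_package, install, and the JSON package layout are hypothetical names and formats, not part of any existing spec or FaaS platform. The registry keys packages by a SHA-256 content digest so that the reference returned by the upload is stable.

```python
import hashlib
import json
from typing import Any, Callable, Dict


class ToyRegistry:
    """In-memory package registry keyed by content digest (hypothetical)."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def upload(self, package: bytes) -> str:
        """Step 2: store the package and return a stable reference to it."""
        digest = "sha256:" + hashlib.sha256(package).hexdigest()
        self._blobs[digest] = package
        return digest

    def download(self, digest: str) -> bytes:
        """Step 3a: fetch the exact bytes that were uploaded under `digest`."""
        return self._blobs[digest]


def build_package(source: str, entry_point: str) -> bytes:
    """Step 1: bundle the function source and its entry point definition."""
    manifest = {"entry_point": entry_point, "source": source}
    return json.dumps(manifest, sort_keys=True).encode("utf-8")


def install(package: bytes, runtime_config: Dict[str, Any]) -> Callable[[Dict[str, Any]], Any]:
    """Step 3b: load the package and bind runtime-specific configuration."""
    manifest = json.loads(package.decode("utf-8"))
    namespace: Dict[str, Any] = {}
    # Toy only: a real runtime would sandbox the function, never exec() it.
    exec(manifest["source"], namespace)
    handler = namespace[manifest["entry_point"]]

    def invoke(event: Dict[str, Any]) -> Any:
        """Step 4: accept an event at the defined entry point."""
        return handler(event, runtime_config)

    return invoke
```

Because the runtime configuration is passed in only at install time, the same package bytes could in principle be installed on a different platform with a different configuration, which is the separation the article argues for.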
Creating a function package
This general process can be applied across many serverless providers! Developers only need to worry about creating a clean Function Package and keeping it persistent. Runtime configuration can be provided upon installation and can be vendor-specific.
A Function Package would contain:
Function files – in source code or binary format
Direct dependencies – libraries and data files
Entry point definition
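As a rough illustration of the three parts listed above, a package manifest could be modeled as an immutable record. The FunctionPackage class, its field names, and the example values are hypothetical; no such spec exists yet, so this is only one possible shape.

```python
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)  # frozen: a published package should not change
class FunctionPackage:
    """Hypothetical manifest for a Function Package."""

    name: str
    version: str
    function_files: Tuple[str, ...]  # source code or binaries
    dependencies: Tuple[str, ...]    # libraries and data files
    entry_point: str                 # what the runtime invokes

    def validate(self) -> None:
        """Reject manifests that are missing a required part."""
        if not self.function_files:
            raise ValueError("a package needs at least one function file")
        if not self.entry_point:
            raise ValueError("a package needs an entry point definition")

    def to_json(self) -> str:
        """Serialize the manifest so it can be stored alongside the files."""
        return json.dumps(asdict(self), sort_keys=True, indent=2)
```

Note that nothing in the manifest refers to events, routing, or the hosting platform; those runtime dependencies would be supplied separately at installation time.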
The experience for the developer is simple and programming-focused, and therefore we can create spec “profiles” that map to a specific language type. For example:
Function files
Go: Go module source files
Container image: build file references to generic files run by the entry point
Direct dependencies
Java: generated pom file for the jar with a full flat list of dependencies + data files
Go: go.mod file containing the dependencies + data files
Container image: build file references to base image + other service files and data
Entry point definition
Go: “main” package reference
Container image: build file references to image entry point
What about dependencies?
A function package does not necessarily have to physically embed binary dependencies if they can be reliably provided by the runtime prior to installation. Instead, only references to dependencies could be declared: for example, external JAR coordinates declared in a POM file, or Go packages declared in a go.mod file. These dependencies would be pulled by the runtime during installation – similar to how a container image is pulled from a Docker registry by function runtimes.
By using stable references, we guarantee repeatability and reuse. We also create lighter packages that allow for quicker and more economical installation of functions, because dependencies can be pulled from a registry close to the runtime rather than re-uploaded each time.
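To make the idea of declared references concrete, here is a minimal sketch of a runtime resolving references at install time and caching the results. DependencyCache, install_dependencies, and the reference strings are hypothetical; the fetch function stands in for whatever registry client a real runtime would use.

```python
from typing import Callable, Dict, Iterable


class DependencyCache:
    """Resolves dependency references, fetching each one at most once."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch  # stand-in for a call to a nearby registry
        self._cache: Dict[str, bytes] = {}
        self.fetch_count = 0

    def resolve(self, reference: str) -> bytes:
        """Return the bytes for a reference, hitting the registry only on a miss."""
        if reference not in self._cache:
            self._cache[reference] = self._fetch(reference)
            self.fetch_count += 1
        return self._cache[reference]


def install_dependencies(references: Iterable[str], cache: DependencyCache) -> Dict[str, bytes]:
    """Pull every declared reference for one function installation."""
    return {reference: cache.resolve(reference) for reference in references}
```

Installing a second function that declares the same references would reuse the cached bytes, so the package itself only needs to carry the references.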
Creating profiles for common runtime languages allows functions to be easily moved across different serverless FaaS providers, adding vendor-specific information pertaining to runtime configuration and eventing only at the installation phase. For application developers, this means they can avoid vendor lock-in and focus on writing their applications’ business logic without having to become familiar with operational aspects. This goal can be achieved more quickly by creating shareable Function Packages as a CNCF Serverless spec. If you are interested in discussing more – let’s talk!
KubeCon + CloudNativeCon has expanded from its start with 500 attendees in 2015 to become one of the largest and most successful open source conferences ever. With that growth comes challenges, and CNCF is eager to evolve the conference over time to best serve the cloud native community. Our upcoming event in Seattle (December 10-13, 2018), our biggest yet, is sold out several weeks ahead of time with 7,500 attendees.
From the start and throughout this growth, we’ve appreciated the feedback and input the community has shared. We carefully review the post-event surveys and listen closely to suggestions and new ideas. This feedback loop is crucial and allows us to iterate and improve.
As we open the call for proposals (CFP) for Barcelona (May 20-23, 2019), we want to share several changes we’re planning to make in 2019, as well as some changes we considered but decided not to implement at this time. CNCF is part of the Linux Foundation (LF) and leverages the LF’s decade of experience running open source events, including more than 100 in 2018 with more than 30,000 attendees from more than 11,000 organizations and 113 countries. We’ve also received a lot of feedback from previous events, much of it laudatory and some with specific proposals for improvement.
Here are some changes we’re planning to implement in 2019:
CFPs will have room for longer submissions to encourage presenters to share additional background and technical information in their proposals.
We will be willing to provide additional feedback to submitters whose talks are not selected. This feedback will fall in a set of categories rather than be personalized.
We have improved our tooling to only accept a single CFP talk from each speaker (or two co-presenter talks), and are limiting submitters to two solo submissions, 4 co-presenter submissions, or a combination.
In the maintainer track offered to CNCF-hosted projects, the Kubernetes SIGs and working groups, and CNCF working groups, we’re offering the opportunity to combine the intro and deep dive sessions into a longer 80-minute session. (Note that maintainer talks do not count against the single CFP talk per speaker quota.)
We are introducing two new kinds of smaller events in 2019. Kubernetes Days will be single-day, single-track events targeted at regions with large numbers of developers who cannot necessarily travel easily to our premier events in Europe, China, and North America. Cloud Native Community Days will be regional events run by community members and will provide additional opportunities for speakers, practitioners and end users to come together.
We are encouraging any of our partner summits that would like to try a double-blind talk submission process to do so.
Here are some of the core elements of how we run the event that we are not planning to change:
KubeCon + CloudNativeCon is a conference for developers and end users (broadly defined) to communicate about open source, cloud native technologies.
Talks are rated by a program committee of community leaders and highly-rated speakers from past conferences, organized by conference co-chairs. The program committee is selected by the conference co-chairs, who also select the tracks and make the final talk selections.
Whether a company is a sponsor of the event or a member of CNCF has no impact on whether talks from their developers are selected. The only exception is that each diamond sponsor (6 in total) gets a 5-minute sponsored keynote. The co-chairs are now working with all keynote speakers, including the sponsored ones, to avoid vendor pitches so that the talks resonate with a community audience.
All talks are about using and/or developing open source software. Although many speakers are employed by software vendors, the conference content is focused on working with open source, not vendor offerings.
Talks can discuss one of CNCF’s 19 graduated/incubating projects or 11 sandbox projects or any other open source technology that adds value to the cloud native ecosystem.
We remain committed to increasing the voice of those who have been traditionally underrepresented in tech.
We select community leaders to serve as conference co-chairs and represent the cloud native community. The co-chairs for Barcelona are Janet Kuo of Google and Bryan Liles of Heptio. They are in the process of selecting a program committee of around 80 experts, which includes project maintainers, active community members, and highly-rated presenters from past events. Program committee members register for the topic areas they’re comfortable covering, and CNCF staff randomly assign a subset of relevant talks to each member. We then collate all of the reviews and the conference co-chairs spend a very challenging week assembling a coherent set of topic tracks and keynotes from the highest-rated talks. Here are the scoring guidelines we provide to the program committee. There is not a one-to-one mapping of topic areas to session tracks. We look to the conference co-chairs to craft a program that reflects current trends and interests in the cloud native community.
The above process is used to select the ~180 CFP sessions, which are offered in ~10 rooms. The keynote talks are selected by the conference co-chairs from highly-rated CFP submissions, or in rare cases, by invitation of the co-chairs to specific speakers.
In addition, KubeCon + CloudNativeCon also includes ~90 maintainer sessions spread across ~5 rooms. This is content produced by the maintainers of CNCF-hosted projects to inform users about the projects, add new adopters, and transition some of them from users to contributors. Sessions in the maintainer track are open to each of CNCF’s (29) hosted projects, the Kubernetes SIGs and working groups, and CNCF working groups. Each of these can do one 35-minute Intro and one 35-minute Deep Dive session. New for 2019, we’re offering to schedule these back-to-back to enable one 80-minute session.
Another fast-growing part of KubeCon + CloudNativeCon is the partner summits held the day before the event. This is an opportunity for projects and companies in the cloud native community to engage with KubeCon + CloudNativeCon attendees. For Seattle, there are 27 separate events! They range from community-organized events like the Kubernetes Contributor Summit and EnvoyCon to member-organized summits to open source projects from adjacent communities like networking initiatives FD.io and Tungsten Fabric. The content and pricing of these events are determined by the organization that runs each one.
The Review Process
Submissions for the CFP sessions are selected in a single-blind process. That is, the reviewer can see information on who is proposing the talk but the submitter does not see who reviewed their submission. Some academic conferences have switched to double-blind submissions, where the submitter removes all identifying information from their submission and the reviewers judge it based solely on the quality of the content. The downside is that it would require significantly more detailed submissions.
Submissions for Barcelona consist of a title and up to a 900 character description, which is used in the schedule if the talk is selected. There is an additional Benefits to the Ecosystem section of up to 1,500 characters to make the case for the submission (this is up significantly from the 300 characters allowed in 2018). To support double-blind selection, we would need to require submissions of 9,000 characters (~3 pages) or more, which is typical of academic-style conferences to encourage effective review. We believe this would discourage many of the practitioners and end users of cloud native technologies from submitting, and more talks would come from academics and those with the time and proclivity to make longer submissions.
This has pros and cons, but it would be a very significant change and unprecedented among open source conferences run by the LF. We considered testing a double-blind process with one topic area (such as service mesh) but decided that it would be too big of a change for an unknown improvement. Instead, we are encouraging any of our partner summits that would like to try a double-blind talk submission process to do so. The LF Events staff is happy to work with them to organize such a process for Barcelona or future events, and if the results go well, we might expand to one or more tracks at a future KubeCon + CloudNativeCon.
For Seattle, the acceptance rate was only 13%, which we understand creates a lot of disappointment and frustration when a very good talk is not accepted. For 2019, we will be providing additional feedback to submitters whose talks are not selected. This feedback will fall in a set of categories such as “not in top half of scores submitted” and “highly-rated but a similar talk was accepted instead.”
How to Get Your Talk Accepted
Whether a company is a member or end user supporter of CNCF or is sponsoring the event has no impact on whether talks from their developers will be selected. The only exception is that the 6 diamond sponsors each get a 5-minute sponsored keynote. However, being a community leader does have an impact, as program committee members will often rate talks from the creators or leaders of an open source project more highly.
Avoid the common pitfall of submitting a sales or marketing pitch for your product or service, no matter how compelling it is. Focus on your work with an open source project, whether it is one of the CNCF’s 29 hosted projects or a new project that adds value to the cloud native ecosystem.
KubeCon + CloudNativeCon is fundamentally a community conference focusing on the development and deployment of cloud native open source projects. So, pick your presenter and target audience accordingly. Our participants range from top experts to total beginners, so we explicitly ask what level of technical difficulty your talk is targeted for (beginner, intermediate, advanced, or any) and aim to provide a range.
We often get many submissions covering almost the same concept, so even if there are several great submissions, the co-chairs will probably only pick one. Consider choosing a more unique topic that is relevant, but less likely to be submitted by multiple people.
Our community is particularly interested in end users adopting cloud native technology. End users are companies that use cloud native technologies internally but do not sell any cloud native services externally. End users generally do not have a commercial product on the Cloud Native Landscape, though they may have created an open source project to share their internal technology. For more information, please see the kinds of companies in CNCF’s End User Community. If you don’t work for an end user company, consider co-presenting with an end user who has adopted your technology.
Given that talk recordings are available on YouTube, and there is very limited space on the agenda, avoid submissions that were already presented at a previous KubeCon + CloudNativeCon or any other event. If your submission is similar to a previous talk, please include information on how this version will be different. Make sure your presentation is timely, relevant, and new.
We’ve improved our tooling to only accept a single CFP talk from each speaker, and are limiting submitters to two submissions. Specifically, we’re counting being a co-presenter as 0.5 of a talk, and limiting submissions to 2.0 talks in all. So, at most, you can submit as a co-presenter on 4 talks, a solo presenter on 2 talks, or as a solo presenter on 1 and a co-presenter on 2.
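The counting rule above is simple arithmetic, and a short sketch makes the allowed combinations easy to check. The function and constant names are hypothetical and this is not the actual CFP tooling; it only encodes the weights stated in the paragraph (1.0 per solo talk, 0.5 per co-presented talk, 2.0 in total).

```python
from fractions import Fraction

SOLO_WEIGHT = Fraction(1)             # a solo talk counts as one talk
CO_PRESENTER_WEIGHT = Fraction(1, 2)  # a co-presented talk counts as half
MAX_SUBMISSION_WEIGHT = Fraction(2)   # limit per submitter


def submission_weight(solo: int, co_presented: int) -> Fraction:
    """Total weight of one person's submissions under the stated rule."""
    if solo < 0 or co_presented < 0:
        raise ValueError("submission counts cannot be negative")
    return solo * SOLO_WEIGHT + co_presented * CO_PRESENTER_WEIGHT


def within_quota(solo: int, co_presented: int) -> bool:
    """True if the combination stays at or under the 2.0 limit."""
    return submission_weight(solo, co_presented) <= MAX_SUBMISSION_WEIGHT
```

For example, within_quota(1, 2) holds because 1 + 2 × 0.5 = 2.0, while within_quota(2, 1) does not because 2 + 0.5 = 2.5.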
Look through the talks that were selected for Copenhagen and Seattle and notice that most have clear, compelling titles and descriptions. The CFP form has a section for including resources that will help reviewers assess your submission. If you have given a talk before that was recorded, please include a link to it. Blog posts, code repos, and other contributions can also help establish your credentials, especially if this will be your first public talk (and we encourage first-time speakers to apply).
Finally, we are explicitly interested in increasing the voice of those who have been traditionally underrepresented in tech. All submissions are reviewed on merit, but we remain dedicated to having a diverse and inclusive conference and we will continue to actively take this into account when finalizing the list of speakers and the overall schedule. For example, we don’t accept panel proposals where all speakers are men. We also provide diversity scholarships to offset travel costs.
Other Cloud Native Conferences
In 2019, we plan to continue to hold three KubeCon + CloudNativeCon events, in Barcelona (May 20-23, 2019), Shanghai (June 24-26, 2019), and San Diego (November 18-21, 2019). In addition, we support 160 Meetup groups in 38 countries, which have hosted more than 1,600 events and have more than 80,000 members.
New for 2019, we are going to launch two new kinds of smaller events. Kubernetes Days will be single-day, single-track events targeted at regions with large numbers of developers who cannot necessarily travel easily to our premier events in Europe, China, and North America. The first one will be held in Bengaluru, India on March 23, 2019.
In addition, we are planning to support a set of community-organized events called Cloud Native Community Days. These will be regional events run by community members in those areas and provide additional opportunities for speakers, practitioners and end users to come together. We’ll have more details about these programs early in 2019.
Barcelona and Shanghai Submissions
The CFP for Barcelona is open now and the deadline is January 18, 2019. The deadline for submitting talks to KubeCon + CloudNativeCon Shanghai (June 24-26, 2019) will be February 1, 2019. The submission and selection processes are separate. If you submit the same talk to both and it is accepted for one, it will be rejected from the other, so we encourage you to submit different content to each conference.
If you have questions on our processes for selecting talks or ideas on how to improve them, or other thoughts on CNCF events, please reach out to me at Dee Kumar <email@example.com> or book a time for us to speak at https://calendly.com/deekumar.
JD.com, China’s largest retailer, has been presented with the Top End User Award by the Cloud Native Computing Foundation (CNCF) for its unique usage of cloud native open source projects. The award was announced at China’s first KubeCon + CloudNativeCon conference hosted by CNCF, which gathered thousands of technologists and end users in Shanghai from November 13-15 to discuss the future of open source technology development.
Providing the ultimate e-commerce experience to customers requires JD to house and process an enormous amount of information that must be accessible at incredibly fast speeds. To put it in perspective, five years ago there were only about two billion images in JD’s product databases for customers. Today, there are more than one trillion, and that figure increases by 100 million images each day. This is why JD turned to CNCF’s Kubernetes project in recent years to accommodate its clusters.
JD currently runs the world’s largest Kubernetes cluster in production. The company first rolled out its containerized infrastructure a few years ago and, as the clusters grew, JD was one of the early adopters to shift to Kubernetes. The move, known as JDOS 2.0, marked the beginning of JD’s partnership with CNCF to build stronger collaborative relationships with the industry’s top developers, end users, and vendors. Ultimately, CNCF provided a window for JD to both contribute to and benefit from open source development.
In April, JD became the CNCF’s first platinum end user member, and took a seat on the organization’s governance board in order to help shape the direction of future Foundation initiatives. JD’s overall commitment to open source is highly aligned with its broader Retail as a Service strategy in which the company is empowering other retailers, partners, and industries with a broad range of capabilities in order to increase efficiency, reduce costs, and provide a higher level of customer service.
JD’s Kubernetes clusters support a wide range of workloads and big data and AI-based applications. The platform has boosted collaboration and enhanced productivity by reducing silos between operations and DevOps teams. As a result, JD has contributed code to projects such as Vitess, Prometheus, Kubernetes, CNI (Container Networking Interface), and Helm as part of its collaboration with CNCF.
“One contribution that we are very proud of is Vitess, the CNCF project for scalable MySQL cluster management,” said Haifeng Liu, chief architect, JD.com. “We are not only the largest end user of Vitess, but also a very active and significant contributor. We’re looking forward to working together with CNCF and its members to pave the way for future development of open source technology.”
Vitess allows JD to manage resources much more flexibly and efficiently, reducing operational and maintenance costs, and JD has one of the world’s most complex Vitess deployments. The company is actively collaborating with the CNCF community to add new features such as subquery support and global transactions, setting industry benchmarks.
“JD spearheads the use of cloud native technologies at scale within the APAC market, and is responsible for one of the largest Kubernetes deployments in the world,” said Chris Aniszczyk, COO of Cloud Native Computing Foundation. “The company also makes significant contributions to CNCF projects and its involvement in the community made JD a natural fit for this award.”
JD will continue to work on contributions to cloud native technologies as well as release its own internal and homegrown open source projects to empower others in the community.
Harbor started in 2014 as a humble internal project meant to address a simple use case: storing images for developers leveraging containers. The cloud native landscape was wildly different and tools like Kubernetes were just starting to see the light of day. It took a few years for Harbor to mature to the point of being open sourced in 2016, but the project was a breath of fresh air for individuals and organizations attempting to find a solid container registry solution. We were confident Harbor was addressing critical use cases based on its strong growth in user base early on.
We were incredibly excited when Harbor was accepted to the Cloud Native Sandbox in the summer of 2018. Although Harbor had been open sourced for some years by this point, having a vendor-neutral home immediately impacted the project resulting in increased engagement via our community channels and GitHub activity.
There were many things we immediately began tackling after joining the Sandbox, including addressing some technical debt, laying out a roadmap based solely on community feedback, and expanding the number of contributors to include folks that have consistently worked on improving Harbor from other organizations. We’ve also started a bi-weekly community call where we hear directly from Harbor users on what’s working well and what’s not. Finally, we’ve ratified a project governance model that defines how the project operates at various levels.
Given Harbor’s already-large global user base across organizations small and large, proposing the project mature into the CNCF Incubator was a natural next step. The processes around progressing to Incubation are defined here. In order to be considered, certain growth and maturity characteristics must first be demonstrated by the project:
Production usage: There must be users of the project that have deployed it to production environments and depend on its functionality for their business needs. We’ve worked closely with a number of large organizations leveraging Harbor over the last several years, so: check!
Healthy maintainer team: There must be a healthy number of members on the team that can approve and accept new contributions to the project from the community. We have a number of maintainers that founded the project and continue to work on it full time, in addition to new maintainers joining the party: check!
Healthy flow of contributions: The project must have a continuous and ongoing flow of new features and code being submitted and accepted into the codebase. Harbor released v1.6 in the summer of 2018, and we’re on the verge of releasing v1.7: check!
CNCF’s Technical Oversight Committee (TOC) evaluated the proposal from the Harbor team and concluded that we had met all the required criteria. It is both deeply humbling and an honor to be in the company of other highly-respected incubated projects like gRPC, Fluentd, Envoy, Jaeger, Rook, NATS, and more.
What’s Harbor anyway?
Harbor is an open source cloud native registry that stores, signs, and scans container images for vulnerabilities.
Harbor solves common challenges by delivering trust, compliance, performance, and interoperability. It fills a gap for organizations and applications that cannot use a public or cloud-based registry, or want a consistent experience across clouds.
Harbor addresses the following common use cases:
On-prem container registry – organizations with the desire to host sensitive production images on-premises can do so with Harbor.
Vulnerability scanning – organizations can scan images before they are used in production. Images with failed vulnerability scans can be blocked from being pulled.
Image signing – images can be signed via Notary to ensure provenance.
Role-based Access Control – integration with LDAP (and AD) to provide user- and group-level permissions.
Image replication – production images can be replicated to disparate Harbor nodes, providing disaster recovery, load balancing and the ability for organizations to replicate images to different geos to provide a more expedient image pull.
The “Harbor stack” is comprised of various 3rd-party components, including nginx, Docker Distribution v2, Redis, and PostgreSQL. Harbor also relies on Clair for vulnerability scanning, and Notary for image signing.
The Harbor components, highlighted in blue, are the heart of Harbor and are responsible for most of the heavy lifting in Harbor:
Core Services provides an API and UI. It intercepts docker pushes / pulls to provide role-based access control and to prevent vulnerable images from being pulled and subsequently used in production (all of this is configurable).
Admin service is being phased out for v1.7, with feature / functionality being merged into the core service.
Job Service is responsible for running background tasks (e.g., replication, one-shot or recurring vulnerability scans, etc.). Jobs are submitted by the core service and run in the job service component.
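As an illustration of the kind of decision the core service makes when it intercepts a pull, here is a minimal sketch. It is not Harbor’s actual code or API: the class names, the severity ordering, and the policy fields are all assumptions made for this example.

```python
from dataclasses import dataclass

# Severity ordering is an assumption made for this sketch.
SEVERITY_ORDER = ["none", "low", "medium", "high", "critical"]


@dataclass
class Image:
    name: str
    worst_severity: str  # worst finding reported by the vulnerability scanner


@dataclass
class ProjectPolicy:
    block_vulnerable: bool  # hypothetical per-project switch
    threshold: str          # pulls are blocked at or above this severity


def pull_allowed(image: Image, policy: ProjectPolicy, user_can_pull: bool) -> bool:
    """Decide whether an intercepted pull request should be served."""
    if not user_can_pull:  # role-based access control comes first
        return False
    if not policy.block_vulnerable:  # blocking is configurable
        return True
    found = SEVERITY_ORDER.index(image.worst_severity)
    limit = SEVERITY_ORDER.index(policy.threshold)
    return found < limit


policy = ProjectPolicy(block_vulnerable=True, threshold="high")
print(pull_allowed(Image("library/app:1.0", "medium"), policy, user_can_pull=True))
print(pull_allowed(Image("library/app:0.9", "critical"), policy, user_can_pull=True))
```

The access check runs before the vulnerability check so that a user without pull rights learns nothing about an image’s scan results.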
Currently Harbor is packaged via both docker-compose service definition and a Helm chart.
Harbor has continued an upward trajectory of community growth through 2018. The stats below visualize the consistent growth pre- and post-acceptance into the Cloud Native Sandbox:
Where we are
Harbor is both mature and production-ready. We know of dozens of large organizations leveraging Harbor in production, including at least one serving millions of container images to tens-of-thousands of compute nodes. The various components that comprise Harbor’s overall architecture are battle-tested in real-world deployments.
Harbor is API driven and is being used in custom SaaS and on-prem products by various vendors and companies. It’s easy to integrate Harbor in your environment, whether a customer-facing SaaS or an internal development pipeline.
The Harbor team strives to release quarterly. We’re currently working on our eighth major release, v1.7, due out soon. Over the last two releases alone we’ve made marked strides in achieving our long-term goals:
Native support of Helm charts
Initial support for deploying Harbor via Helm chart
Refactoring of our persistence layer, now relying solely on PostgreSQL and Redis – this will help us achieve our high-availability goals
Added labels and replication filtering based on labels
Improvements to RBAC, including LDAP group-based access control
Architecture simplification (i.e., collapsing admin server component responsibilities into core component)
Where we’re going
This is the fun part. 🙂
Harbor is a vibrant community of users – those who use Harbor and publicly share their experiences, the individuals who report and respond to issues, the folks who hang around in our Slack community, and those who spend time on GitHub improving our code and documentation. We’re all incredibly fortunate to have the rich and exciting ideas that are proposed via GitHub issues on a regular basis.
We’re still working on our v1.8 roadmap, but here are some major features we’re considering and might land at some point in the future (timing to be determined, and contributions are welcome!):
Quotas – system- and project-level quotas; networking quotas; bandwidth quotas; user quotas; etc.
Replication – the ability to replicate to non-Harbor nodes.
Image proxying and caching – a docker pull would proxy a request to, say, Docker Hub, then scan the image before providing it to the developer. Alternatively, pre-cache images and block images that do not meet vulnerability requirements.
One-click upgrades and rollbacks of Harbor.
Clustering – Harbor nodes should cluster, replicate metadata (users, RBAC and system configuration, vulnerability scan results, etc.). Support for wide-area clustering is a stretch goal.
BitTorrent-backed storage – images are transparently transferred via BT protocol.
Please feel free to share your wishlist of features via GitHub; just open an issue and share your thoughts. We keep track of items the community desires and will prioritize based on demand.
How to get involved
Getting involved in Harbor is easy. Step 1: don’t be shy. We’re a friendly bunch of individuals working on an exciting open source project.
The lowest barrier to entry is joining us on Slack. Ask questions, give feedback, request help, share your ideas on how to improve the project, or just say hello!
We love GitHub issues and pull requests. If you think something can be improved, let us know. If you want to spend a few minutes fixing something yourself – docs, code, error messages, you name it – please feel free to open a PR. We’ve previously discussed how to contribute, so don’t be shy. If you need help with the PR process, the quickest way to get an answer is probably to ping us on Slack.
See you on GitHub!
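By James Zabala (詹姆斯 扎巴拉). Harbor began in 2014 as an internal project meant to solve a simple problem: helping developers store container images. At the time, the cloud native landscape was completely different from today’s, and tools like Kubernetes were only beginning to attract attention. It was not until 2016 that Harbor matured and was open sourced, bringing a new option to individuals and organizations trying to solve similar problems. Continued growth in the user base also convinced us that Harbor was solving a critical problem.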
The bi-annual CNCF survey takes a pulse of the community to better understand the adoption of cloud native technologies. This is the second time CNCF has conducted its cloud native survey in Mandarin to better gauge how Asian companies are adopting open source and cloud native technologies. The previous Mandarin survey was published in March 2018. This post also makes comparisons to the most recent North American / European version of this survey from August 2018.
Usage of public and private clouds in Asia has grown 135% since March 2018, while on-premise has dropped 48%.
Usage of nearly all container management tools in Asia has grown, with commercial off-the-shelf solutions up 58% overall, and home-grown solutions up 690%. Kubernetes has grown 11%.
The number of Kubernetes clusters in production is increasing. Organizations in Asia running 1-5 production clusters decreased 37%, while respondents running 11-50 clusters increased 154%.
Use of serverless technology in Asia has spiked 100% with 29% of respondents using installable software and 21% using a hosted platform.
300 people responded to the Chinese version with 83% being from Asia, compared to 187 respondents from our March 2018 survey.
CNCF has a total of 42 members across China, Japan, and South Korea including 4 platinum members: Alibaba Cloud, Fujitsu, Huawei, and JD.com. A number of these members are also end users, including:
Container usage is becoming prevalent in all phases of the development cycle. There has been a significant jump in the use of containers for testing, up to 42% from 24% in March 2018 with an additional 27% of respondents citing future plans. There has also been an increase in use of containers for Proof of Concept (14% up from 8%).
As the usage of containers becomes more prevalent across all phases of development, the use of container management tools is growing. Since March 2018, there has been a significant jump in the usage of nearly all container management tools.
Usage of Kubernetes has grown 11% since March 2018. Other tools have also grown:
Amazon ECS: up to 22% from 13%
CAPS: up to 13% from 1%
Cloud Foundry: up to 20% from 1%
Docker Swarm: up to 27% from 16%
Shell Scripts: up to 14% from 5%
There are also two new tools that were not cited in the March 2018 survey. 16% of respondents are using Mesos and an additional 8% are using Nomad for container management.
Commercial off-the-shelf solutions (Kubernetes, Docker Swarm, Mesos, etc.) have grown 58% overall, while home-grown management (Shell Scripts and CAPS) has grown 690%, showing that home-grown solutions are still widely popular in Asia while North American and European markets have moved away from those in favor of COTS solutions.
Cloud vs. On-Premise
While on-premise solutions are widely used in the North American and European markets (64%), that number seems to be declining for the Asian market. Only 31% of respondents reported using on-premise solutions in this survey, compared to 60% in March 2018. Cloud usage is growing with 43% of respondents using private clouds (up from 24%) and 51% using public clouds (up from 16%).
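The headline figures for this section (cloud usage up 135%, on-premise down 48%) read as relative changes between the two surveys. The sketch below shows one way to arrive at them, assuming the cloud figure sums the public and private percentages; that summing step is our assumption, not a documented part of the survey methodology.

```python
def relative_change(before: float, after: float) -> float:
    """Percent change from `before` to `after`, relative to `before`."""
    return (after - before) / before * 100


# Percentages of respondents reported above: (March 2018, this survey).
private_cloud = (24, 43)
public_cloud = (16, 51)
on_premise = (60, 31)

# Assumption: the combined cloud figure adds private and public usage.
cloud_before = private_cloud[0] + public_cloud[0]
cloud_after = private_cloud[1] + public_cloud[1]

print(round(relative_change(cloud_before, cloud_after)))  # 135
print(round(relative_change(*on_premise)))                # -48
```

Note that a relative change is different from a change in percentage points: on-premise usage fell 29 points (60% to 31%), which is a 48% relative drop.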
As for where Kubernetes is being run, Alibaba still remains No. 1 with 38% of respondents reporting usage, but is down from 52% in March 2018. Following Alibaba is Amazon Web Services (AWS) with 24% of respondents citing usage, slightly down from 26%.
New environments that were not previously reported and are taking up additional market share are Huawei Cloud (13%), VMware (6%), Baidu Cloud (21%), Tencent Cloud (7%), IBM Cloud (8%), and Packet (5%).
The decline of on-premise usage is also evident in these responses, with 24% of respondents reporting that they run Kubernetes on-prem compared to 38% in March 2018. OpenStack usage has also declined significantly, down to 9% from 26% in March 2018.
For organizations running Kubernetes, the number of production clusters is also increasing. Respondents running 1-5 production clusters decreased 37%, while respondents running 11-50 clusters increased 154%. Still, respondents are mostly running 6-10 production clusters, with 29% reporting that number.
We also asked respondents about the tools they are using to manage various aspects of their applications:
The most popular method of packaging Kubernetes applications is Managed Kubernetes Offerings (37%), followed by Ksonnet (27%) and Helm (24%).
Respondents are primarily using autoscaling for Task and Queue processing applications (44%) and Java Applications (44%). This is followed by stateless applications (33%) and stateful databases (29%).
The top reasons respondents aren’t using Kubernetes autoscaling capabilities are because they are using a third party autoscaling solution (32%), were not aware these capabilities existed (30%), or have built their own solution to autoscale (26%).
The top Kubernetes ingress providers reported are F5 (36%), nginx (34%), and GCP (22%).
Exposing Cluster External Services
The most popular ways to expose Cluster External Services were Load-Balancer Services (43%), integration with a third party load-balancer (37%), and L7 Ingress (35%).
Separating Kubernetes in an Organization with Multiple Teams
Respondents are separating multiple teams within their organization using namespaces (49%), separate clusters (42%), and only labels (34%).
Separating Kubernetes Applications
Respondents are primarily separating their Kubernetes applications using separate clusters (45%), namespaces (46%), and only labels (33%).
Cloud Native Projects
What are the benefits of cloud native projects in production? Respondents cited the top four reasons as:
Improved Availability (47%)
Improved Scalability (46%)
Cloud Portability (45%)
Improved Developer Productivity (45%)
Compared to the North American and European markets, improved availability and developer productivity are more important in the Asian market, while faster deployment time is less important (only 38% cited this compared to 50% in the English version of this survey).
As for the cloud native projects that are being used in production and evaluated:
Many cloud native projects have grown in production usage since March 2018. Projects with the largest spike in production usage are: gRPC (22% up from 13%), Fluentd (11% up from 7%), Linkerd (11% up from 7%), and OpenTracing (27% up from 20%).
The number of respondents evaluating cloud native projects also grew with: gRPC (20% up from 11%), OpenTracing (27% up from 18%), and Zipkin (12% up from 9%).
As cloud native technologies continue to be adopted, especially into production, there are still challenges to address. The top challenges respondents are facing are:
Lack of training (46%)
Difficulty in choosing an orchestration solution (30%)
Finding vendor support (28%)
One interesting note is that many of the challenges have significantly declined since our previous survey in March 2018 as more resources are added to address these concerns. A new challenge that has come up is lack of training. While CNCF has invested heavily in Kubernetes training over the past year, including courses and certification for Kubernetes Administrators and Application Developers, we are still actively working to make translated versions of the courses and certifications available and more easily accessible in Asia. CNCF is also working with a global network of Kubernetes Training Partners to expand these resources, as well as Kubernetes Certified Service Providers to help support organizations with the complexity of embarking on their cloud native journey.
The use of serverless technology has spiked with 50% of organizations using the technology compared to 25% in March 2018. Of that 50%, 29% are using installable software and 21% are using a hosted platform. An additional 17% plan to use the technology within the next 12-18 months.
For installable serverless platforms, Apache OpenWhisk is the most popular with 11% of respondents citing usage. This is followed by Dispatch (6%), FN (5%) and OpenFaaS, Kubeless, and Fission tied at 4%.
For hosted serverless platforms, AWS Lambda is the most popular with 11% of respondents citing usage. This is followed by Azure Functions (8%), and Alibaba Cloud Compute Functions, Google Cloud Functions, and Cloudflare Functions tied at 7%.
Serverless usage in Asia is higher than what we saw in North American and European markets where 38% of organizations were using serverless technology. Hosted platforms (32%) were also much more popular compared to installable software (6%), whereas in Asia both options are more evenly used. There is also much more variety in the solutions used, whereas AWS Lambda and Kubeless were the clear leaders in North America and Europe.
Relating back to CNCF projects, a small percentage of respondents are now evaluating (3%) or using CloudEvents in production (2%). CloudEvents is an effort organized by CNCF’s Serverless Working Group to create a specification for describing event data in a common way.
Cloud Native is Growing in China
As cloud native continues to grow in China, the methods for learning about these technologies become increasingly important. Here are the top ways respondents are learning about cloud native technologies:
50% of respondents are learning through documentation. Each CNCF project hosts extensive documentation on their websites, which can be found here. Kubernetes, in particular, is currently working on translating their documentation and website across multiple languages including Japanese, Korean, Norwegian, and Chinese.
29% of respondents are learning through technical webinars. CNCF runs a weekly webinar series that takes place every Tuesday from 10am-11am PT. You can see the upcoming schedule and view recordings and slides of previous webinars.
As cloud native continues to grow in Asia, CNCF is excited to be hosting the first annual KubeCon + CloudNativeCon in Shanghai this week. With over 1,500 attendees at the inaugural event, we look forward to seeing the continued growth of cloud native technologies at a global scale.
To keep up with the latest news and projects, join us at one of the 22 cloud native Meetups across Asia. We hope to see you at one of our upcoming Meetups!
A huge thank you to everyone who participated in our survey!
You can also view the findings from past surveys here:
The pool of respondents represented a variety of company sizes with the majority being in the 50-499 employee range (48%). As for job function, respondents identified mostly as Developers (22%), Development Manager (15%), and IT Managers (12%).
Respondents represented 31 different industries, the largest being software (13%) and financial services (11%).
This survey was conducted in Mandarin. You can view additional demographics breakdowns below:
The cloud native landscape can be complicated and confusing. Its myriad of open source projects are supported by the constant contributions of a vibrant and expansive community. The Cloud Native Computing Foundation (CNCF) has a landscape map that shows the full extent of cloud native solutions, many of which are under its umbrella.
As a CNCF ambassador, I am actively engaged in promoting community efforts and cloud native education throughout Canada. At CloudOps I lead workshops on Docker and Kubernetes that provide an introduction to cloud native technologies and help DevOps teams operate their applications.
I also organize Kubernetes and Cloud Native meetups that bring in speakers from around the world and represent a variety of projects. They are run quarterly in Montreal, Ottawa, Toronto, Kitchener-Waterloo, and Quebec City. Reach out to me @archyufa or email CloudOps to learn more about becoming cloud native.
In the meantime, I have written a beginner’s guide to the cloud native landscape. I hope that it will help you understand the landscape and give you a better sense of how to navigate it.
The History of the CNCF
In 2014 Google open sourced an internal project called Borg that they had been using to orchestrate containers. Not having a place to land the project, Google partnered with the Linux Foundation to create the Cloud Native Computing Foundation (CNCF), which would encourage the development and collaboration of Kubernetes and other cloud native solutions. The Borg implementation was rewritten in Go, renamed Kubernetes, and donated as the founding project. It became clear early on that Kubernetes was just the beginning and that a swarm of new projects would join the CNCF, extending the functionality of Kubernetes.
The CNCF Mission
The CNCF fosters this landscape of open source projects by helping provide end-user communities with viable options for building cloud native applications. By encouraging projects to collaborate with each other, the CNCF hopes to enable fully-fledged technology stacks comprised solely of CNCF member projects. This is one way that organizations can own their destinies in the cloud.
A total of twenty-five projects have followed Kubernetes and been adopted by the CNCF. In order to join, projects must be selected and then elected with a supermajority by the Technical Oversight Committee (TOC). The voting process is aided by a healthy community of TOC contributors, who are representatives from CNCF member companies, including myself. Member projects will join the Sandbox, Incubation, or Graduation phase depending on their level of code maturity.
Sandbox projects are in a very early stage and require significant code maturity and community involvement before being deployed in production. They are adopted because they offer unrealized potential. The CNCF’s guidelines state that the CNCF helps encourage the public visibility of sandbox projects and facilitate their alignment with existing projects. Sandbox projects receive minimal funding and marketing support from the CNCF and are subject to review and possible removal every twelve months.
Projects enter Incubation when they meet all Sandbox criteria and demonstrate certain growth and maturity characteristics. They must be in production use by at least three companies and maintain a healthy team that approves and accepts a healthy flow of contributions, including new features and code from the community.
Once Incubation projects have reached a tipping point in production use, they can be voted by the TOC to have reached Graduation phase. Graduated projects have to demonstrate thriving adoption rates and meet all Incubation criteria. They must also have committers from at least two organizations, have documented and structured governance processes, and meet the Linux Foundation Core Infrastructure Initiative’s Best Practices Badge. So far, only Kubernetes and Prometheus have graduated.
The Projects Themselves
Below I’ve grouped projects into thirteen categories: orchestration, app development, monitoring, logging, tracing, container registries, storage and databases, runtimes, service discovery, service meshes, service proxy, security, and streaming and messaging. I’ve also provided information that can help companies or individuals evaluate what each project does, how it integrates with other CNCF projects, and how it has evolved to its current state.
Kubernetes (graduated) – Kubernetes automates the deployment, scaling, and management of containerised applications, emphasising automation and declarative configuration. It means helmsman in ancient Greek. Kubernetes orchestrates containers, which are packages of portable and modular microservices. Kubernetes adds a layer of abstraction, grouping containers into pods. Kubernetes helps engineers schedule workloads and allows containers to be deployed at scale over multi-cloud environments. Having graduated, Kubernetes has reached a critical mass of adoption. In a recent CNCF survey, over 40% of respondents from enterprise companies are running Kubernetes in production.
Helm (Incubating) – Helm is an application package manager that allows users to find, share, install, and upgrade Kubernetes applications (aka charts) with ease. It helps end users deploy existing applications (including MySQL, Jenkins, Artifactory, etc.) using KubeApps Hub, which displays charts from stable and incubator repositories maintained by the Kubernetes community. With Helm you can install all other CNCF projects that run on top of Kubernetes. Helm also lets organizations create and then deploy custom applications or microservices to Kubernetes. This would otherwise involve creating YAML manifests with numerical values that are not suitable for deployment in different environments or CI/CD pipelines. Helm creates single charts that can be versioned based on application or configuration changes, deployed in various environments, and shared across organizations.
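To illustrate why templated charts travel better between environments than fixed manifests, here is a minimal sketch of the idea. It uses Python’s string.Template as a stand-in; real Helm charts use Go templates and a values.yaml file, and the manifest fragment, value names, and defaults below are hypothetical.

```python
from string import Template

# A hypothetical, deliberately incomplete manifest fragment.
# Real Helm charts use Go templates, not string.Template.
MANIFEST = Template(
    "apiVersion: apps/v1\n"
    "kind: Deployment\n"
    "metadata:\n"
    "  name: $name\n"
    "spec:\n"
    "  replicas: $replicas\n"
)

# Default values play the role of a chart's values file (an analogy only).
DEFAULTS = {"name": "web", "replicas": 1}


def render(overrides: dict) -> str:
    """Merge per-environment overrides into the defaults and fill the template."""
    values = {**DEFAULTS, **overrides}
    return MANIFEST.substitute(values)


print(render({}))               # development: defaults only
print(render({"replicas": 3}))  # production: same template, different values
```

The template stays the same across environments; only the small set of values changes, which is what makes a single chart versionable and shareable.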
Helm originated at Deis from an attempt to create a ‘homebrew’ experience for Kubernetes users. Helm V1 consisted of the client-side of what is currently the Helm Project. The server-side ‘tiller’, or Helm V2, was added by Deis in collaboration with Google at around the same time that Kubernetes 1.2 was released. This was how Helm became the standard way of deploying applications on top of Kubernetes.
Helm is currently making a series of changes and updates in preparation for the release of Helm V3, which is expected to happen by the end of the year. Companies that rely on Helm for their daily CI/CD development, including Reddit, Ubisoft, and Nike, have suggested improvements for the redesign.
Telepresence (Sandbox) – It can be challenging to develop containerized applications on Kubernetes. Popular tools for local development include Docker Compose and Minikube. Unfortunately, most cloud native applications today are resource intensive and involve multiple databases, services, and dependencies. Moreover, it can be complicated to mimic cloud dependencies, such as messaging systems and databases in Compose and Minikube. An alternative approach is to use fully remote Kubernetes clusters, but this precludes you from developing with your local tools (e.g., IDE, debugger) and creates slow developer “inner loops” that make developers wait for CI to test changes.
Telepresence, which was developed by Datawire, offers the best of both worlds. It allows the developer to ‘live code’ by running single microservices locally for development purposes while remaining connected to remote Kubernetes clusters that run the rest of their application. Telepresence deploys pods that contain two-way network proxies on remote Kubernetes clusters. This connects local machines to proxies. Telepresence implements realistic development/test environments without freezing local tools for coding, debugging, and editing.
Prometheus (Graduated) – Following in the footsteps of Kubernetes, Prometheus was the second project to join the CNCF and the second (and so far last) project to have graduated. It’s a monitoring solution that is suitable for dynamic cloud and container environments. It was inspired by Google’s monitoring system, Borgmon. Prometheus is a pull-based system – its configuration decides when and what to scrape. This is unlike push-based monitoring systems, where a monitoring agent running on each node sends metrics to a central server. Prometheus stores scraped metrics in a TSDB (time series database). Prometheus allows you to create meaningful graphs inside Grafana dashboards with its powerful query language, PromQL. You can also generate and send alerts to various destinations, such as Slack and email, using its Alertmanager component.
Hugely successful, Prometheus has become the de facto standard in cloud native metric monitoring. With Prometheus one can monitor VMs, Kubernetes clusters, and microservices being run anywhere, especially in dynamic systems like Kubernetes. Prometheus’ metrics can also drive automated scaling decisions through Kubernetes features including HPA, VPA, and Cluster Autoscaling. Prometheus can monitor other CNCF projects such as Rook, Vitess, Envoy, Linkerd, CoreDNS, Fluentd, and NATS. Prometheus’ exporters integrate with many other applications and distributed systems. Use Prometheus’ official Helm Chart to start.
OpenMetrics (Sandbox) – OpenMetrics creates neutral standards for an application’s metric exposition format. Its modern metric standard enables users to transmit metrics at scale. OpenMetrics is based on the popular Prometheus exposition format, which has over 300 existing exporters and is based on operational experience from Borgmon. Borgmon enables ‘white-box monitoring’ and mass data collection with low overheads. The monitoring landscape before OpenMetrics was largely based on outdated standards and techniques (such as SNMP) that use proprietary formats and place minimal focus on metrics. OpenMetrics builds on the Prometheus exposition format, but has a tighter, cleaner, and more enhanced syntax. While OpenMetrics is only in the Sandbox phase, it is already being used in production by companies including AppOptics, Cortex, Datadog, Google, InfluxData, OpenCensus, Prometheus, Sysdig, and Uber.
Cortex (Sandbox) – Operational simplicity has always been a primary design objective of Prometheus. Consequently, Prometheus itself can only be run without clustering (as a single node or container) and can only use local storage that is not designed to be durable or long-term. Clustering and distributed storage come with additional operational complexity that Prometheus forwent in favour of simplicity. Cortex is a horizontally scalable, multi-tenant, long-term storage solution that can complement Prometheus. It allows large enterprises to use Prometheus while maintaining access to HA (High Availability) and long-term storage. There are currently other projects in this space that are gaining community interest, such as Thanos, Timbala, and M3DB. However, Cortex has already been battle-tested as a SaaS offering at both Grafana Labs and Weaveworks and is also deployed on prem by both EA and StorageOS.
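To make the notion of an exposition format concrete, here is a minimal sketch that renders a counter in the line-oriented Prometheus text format that OpenMetrics builds on. The function names are hypothetical, and the stricter requirements that OpenMetrics adds on top of this format are not modeled.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote, and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_counter(name: str, help_text: str, samples: dict) -> str:
    """Render one counter family as text.

    samples maps a tuple of (label, value) pairs to the sample's number;
    an empty tuple means a sample without labels.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples.items():
        label_str = ",".join(f'{k}="{escape_label_value(v)}"' for k, v in labels)
        suffix = f"{{{label_str}}}" if label_str else ""
        lines.append(f"{name}{suffix} {value}")
    return "\n".join(lines) + "\n"
```

A scraper that pulls this text over HTTP only needs to parse one metric per line, which is part of why so many exporters could be written against the format.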
Logging and Tracing
Fluentd (Incubator) – Fluentd collects, interprets, and transmits application logging data. It unifies data collection and consumption so you can better use and understand your data. Fluentd structures data as JSON and brings together the collecting, filtering, buffering, and outputting of logs across multiple sources and destinations. Fluentd can collect logs from VMs and traditional applications; however, it really shines in cloud native environments that run microservices on top of Kubernetes, where applications are created in a dynamic fashion.
Fluentd runs in Kubernetes as a daemonset (a workload that runs on each node). It collects logs from all applications that run as containers (including CNCF ones) and emit logs to STDOUT. Fluentd also parses and buffers incoming log entries and sends formatted logs to configured destinations, such as Elasticsearch, Hadoop, and Mongo, for further processing.
Fluentd was initially written in Ruby and takes over 50 MB of memory at runtime, making it unsuitable for running alongside containers in sidecar patterns. Fluentbit is being developed alongside Fluentd as a solution. Fluentbit is written in C and only uses a few hundred KB of memory at runtime. Fluentbit is more efficient in CPU and memory usage, but has more limited features than Fluentd. Fluentd was originally developed by Treasure Data as an open source project.
Fluentd is available as a Kubernetes plugin and can be deployed as version 0.12, an older and more stable version that is currently widely deployed in production. The new version (Version 1.X) was recently developed and has many improvements, including new plugin APIs, nanosecond resolution, and Windows support. Fluentd is becoming the standard for log collection in the cloud native space and is a solid candidate for CNCF Graduation.
OpenTracing (Incubator) – Do not underestimate the importance of distributed tracing for building microservices at scale. Developers must be able to view each transaction and understand the behaviour of their microservices. However, distributed tracing can be challenging because the instrumentation must propagate the tracing context both within and between the processes that exist throughout services, packages, and application-specific code. OpenTracing allows developers of application code, OSS packages, and OSS services to instrument their own code without locking into any particular tracing vendor. OpenTracing provides a distributed tracing standard for applications and OSS packages, with vendor-neutral APIs and libraries available in nine languages. These standardize distributed tracing, making OpenTracing ideal for service meshes and distributed systems. OpenTracing itself is not a tracing system that runs traces to analyze spans from within the UI. It is an API that works with application business logic, frameworks, and existing instrumentation to create, propagate, and tag spans. It integrates with both open source (e.g. Jaeger, Zipkin) and commercial (e.g. Instana, Datadog) tracing solutions, and creates traces that are either stored in a backend or rendered in a UI. Click here to try a tutorial or start instrumenting your own system with Jaeger, a compatible tracing solution.
Jaeger is itself observable: it exposes Prometheus metrics by default and integrates with Fluentd for log shipping. Start deploying Jaeger to Kubernetes using a Helm chart or the recently developed Jaeger Operator. Most contributions to the Jaeger codebase come from Uber and Red Hat, but there are hundreds of companies adopting Jaeger for cloud native, microservices-based, distributed tracing.
Harbor (Sandbox) – Harbor is an open source trusted container registry that stores, signs, and scans Docker images. It provides free-of-charge, enhanced Docker registry features and capabilities. These include a web interface with role-based access control (RBAC) and LDAP support. It integrates with Clair, an open source project developed by CoreOS, for vulnerability scanning and with Notary, a CNCF Incubation project described below, for content trust. Harbor provides activity auditing and Helm chart management, and replicates images from one Harbor instance to another for HA and DR. Harbor was originally developed by VMware as an open source solution. It is now being used by companies of many sizes, including TrendMicro, Rancher, Pivotal, and AXA.
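The propagation problem described above can be sketched as follows. This is not the OpenTracing API: the `Span` class, the `inject` and `extract` helpers, and the header names are hypothetical and only illustrate how a trace context travels between processes.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Span:
    operation: str
    trace_id: str
    span_id: str
    parent_id: Optional[str] = None
    tags: Dict[str, str] = field(default_factory=dict)


def start_span(operation: str, parent: Optional[Span] = None) -> Span:
    """Start a span; a child shares its parent's trace id."""
    trace_id = parent.trace_id if parent else uuid.uuid4().hex
    parent_id = parent.span_id if parent else None
    return Span(operation, trace_id, uuid.uuid4().hex[:16], parent_id)


def inject(span: Span, headers: Dict[str, str]) -> None:
    """Write the span context into outgoing request headers (hypothetical names)."""
    headers["x-trace-id"] = span.trace_id
    headers["x-parent-span-id"] = span.span_id


def extract(operation: str, headers: Dict[str, str]) -> Span:
    """Continue the caller's trace on the receiving side, or start a new one."""
    if "x-trace-id" not in headers:
        return start_span(operation)
    return Span(operation, headers["x-trace-id"], uuid.uuid4().hex[:16],
                headers.get("x-parent-span-id"))
```

A client would call `inject` before sending a request and the server would call `extract` on arrival, so that both spans end up in the same trace regardless of which tracing backend stores them.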
Storage and Databases
Rook (Incubator) – Rook is an open source cloud native storage orchestrator for Kubernetes. With Rook, ops teams can run Software-Defined Storage (SDS) systems (such as Ceph) on top of Kubernetes. Developers can then use that storage to dynamically create Persistent Volumes (PV) in Kubernetes to deploy applications, such as Jenkins, WordPress and any other app that requires state. Ceph is a popular open-source SDS that can provide many popular types of storage, such as Object, Block and File System, and runs on top of commodity hardware. While it is possible to run Ceph clusters outside of Kubernetes and connect them to Kubernetes using the CSI plugin, deploying and then operating Ceph clusters on hardware is a challenging task, which reduces the popularity of the system. Rook deploys and integrates Ceph inside Kubernetes as a first class object using Custom Resource Definitions (CRDs) and turns it into a self-managing, self-scaling, and self-healing storage service using the Operator Framework. The goal of Operators in Kubernetes is to encode human operational knowledge into software that is more easily packaged and shared with end users. In comparison to Helm, which focuses on packaging and deploying Kubernetes applications, an Operator can deploy and manage the life cycles of complex applications. In the case of Ceph, the Rook Operator automates storage administrator tasks, such as deployment, bootstrapping, configuration, provisioning, horizontal scaling, healing, upgrading, backups, disaster recovery and monitoring. Initially, the Rook Operator’s implementation supported Ceph only. As of version 0.8, Ceph support has been moved to Beta. Project Rook later announced the Rook Framework for storage providers, which extends Rook as a general purpose cloud native storage orchestrator that supports multiple storage solutions with reusable specs, logic, policies and testing. Currently Rook supports CockroachDB, Minio, and NFS, all in alpha, with support for Cassandra, Nexenta, and Alluxio planned for the future.
The list of companies using the Rook Operator with Ceph in production is growing, especially among companies deploying on prem. They include CENGN, Gini, and RPR, with many more in the evaluation stage.
Vitess (Incubator) – Vitess is a middleware for databases. It employs generalized sharding to distribute data across MySQL instances. It scales horizontally and can scale indefinitely without affecting your application. When your shards reach full capacity, Vitess will reshard your underlying database with zero downtime and good observability. Vitess solves many problems associated with transactional data, which is continuing to grow.
TiKV (Sandbox) – TiKV is a transactional key-value database that offers simplified scheduling and auto-balancing. It acts as a distributed storage layer that supports strong data consistency, distributed transactions, and horizontal scalability. TiKV was inspired by the design of Google Spanner and HBase, but has the advantage of not depending on a distributed file system. TiKV was developed by PingCAP and currently has contributors from Samsung, Tencent Cloud, and UCloud.
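The way an Operator encodes operational knowledge is usually a reconcile loop: compare the desired state declared in a custom resource with what is actually running, and act on the difference. The sketch below shows only that comparison step; the component names and the `reconcile` function are hypothetical and are not taken from Rook's code.

```python
from typing import Dict, List


def reconcile(desired: Dict[str, int], observed: Dict[str, int]) -> List[str]:
    """Return the actions needed to move observed state toward desired state.

    Both arguments map a hypothetical component name (e.g. "mon", "osd")
    to a replica count.
    """
    actions = []
    for component in sorted(set(desired) | set(observed)):
        want = desired.get(component, 0)
        have = observed.get(component, 0)
        if have < want:
            actions.append(f"start {want - have} {component}")
        elif have > want:
            actions.append(f"stop {have - want} {component}")
    return actions
```

A real operator runs this kind of comparison every time the cluster or the custom resource changes, which is what makes the managed service self-healing.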
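As a rough illustration of sharding and resharding, the sketch below routes each row key to a shard by hashing it and looking the hash up in a list of range boundaries. The hash function and the boundary layout are assumptions made for the example and are not Vitess's actual implementation.

```python
import hashlib
from bisect import bisect_right
from typing import List


def key_hash(key: str) -> int:
    """Map a row key to a value in 0..255 (a stand-in hash for illustration)."""
    return hashlib.sha256(key.encode()).digest()[0]


def shard_for(key: str, boundaries: List[int]) -> int:
    """Return the index of the range that contains the key's hash.

    boundaries are the sorted split points; [128] means two shards covering
    0..127 and 128..255.
    """
    return bisect_right(boundaries, key_hash(key))
```

Splitting every range in half, for example going from `[128]` to `[64, 128, 192]`, moves each key only into one of the two halves of its old shard, which is what allows data to be copied shard by shard while the application keeps running.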
RKT (Incubator) – RKT (read as Rocket) is an application container runtime that was originally developed at CoreOS. Back when Docker was the default runtime for Kubernetes and was baked into kubelet, the Kubernetes and Docker communities had challenges working with each other. Docker Inc., the company behind the development of Docker as open source software, had its own roadmap and was adding complexity to Docker. For example, they were adding swarm-mode or changing the filesystem from AUFS to overlay2 without providing notice. These changes were generally not well coordinated with the Kubernetes community and complicated roadmap planning and release dates. At the end of the day, Kubernetes users need a simple runtime that can start and stop containers and provide functionalities for scaling, upgrading, and uptimes. With RKT, CoreOS intended to create an alternative runtime to Docker that was purposely built to run with Kubernetes. This eventually led to the SIG-Node team of Kubernetes developing a Container Runtime Interface (CRI) for Kubernetes that can connect any type of container runtime and remove Docker code from its core. RKT can consume both OCI Images and Docker format Images. While RKT had a positive impact on the Kubernetes ecosystem, this project was never widely adopted by end users, specifically by developers who are used to the Docker CLI and don’t want to learn alternatives for packaging applications. Additionally, due to the popularity of Kubernetes, there is a sprawl of container solutions competing for this niche. Projects like gvisor and cri-o (based on OCI) are gaining popularity these days while RKT is losing its position. This makes RKT a potential candidate for removal from the CNCF Incubator.
Containerd (Incubator) – Containerd is a container runtime that emphasises simplicity, robustness and portability. In contrast to RKT, Containerd is designed to be embedded into a larger system, rather than being used directly by developers or end-users. Similar to RKT, containerd can consume both OCI and Docker Image formats. Containerd was donated to the CNCF by the Docker project. Back in the day, Docker’s platform was a monolithic application. However, with time, it became a complex system due to the addition of features, such as swarm mode. The growing complexity made Docker increasingly hard to manage, and its complex features were redundant if you were using Docker with systems like Kubernetes that required simplicity. As a result, Kubernetes started looking for alternative runtimes, such as RKT, to replace Docker as the default container runtime. The Docker project then decided to break itself up into loosely coupled components and adopt a more modular architecture. This effort became known as the Moby Project, where containerd was used as the core runtime functionality. Containerd was later integrated into Kubernetes via a CRI interface known as cri-containerd. However, cri-containerd is no longer required because containerd comes with a built-in CRI plugin that is enabled by default starting from Kubernetes 1.10 and avoids an extra gRPC hop. While containerd has its place in the Kubernetes ecosystem, projects like cri-o (based on OCI) and gvisor are gaining popularity these days and containerd is losing community interest. However, it is still an integral part of the Docker Platform.
CoreDNS (Incubator) – CoreDNS is a DNS server that provides service discovery in cloud native deployments. CoreDNS is a default Cluster DNS in Kubernetes starting from its version 1.12 release. Prior to that, Kubernetes used SkyDNS and later KubeDNS. SkyDNS – a dynamic DNS-based service discovery solution – had an inflexible architecture that made it difficult to add new functionalities or extensions. Kubernetes later used KubeDNS, which ran as 3 containers (kube-dns, dnsmasq, sidecar), was prone to dnsmasq vulnerabilities, and had similar issues extending the DNS system with new functionalities. On the other hand, CoreDNS was re-written in Go from scratch and is a flexible plugin-based, extensible DNS solution. It runs inside Kubernetes as one container vs. KubeDNS, which runs with three. It is not exposed to the dnsmasq vulnerabilities and can update its configuration dynamically using ConfigMaps. Additionally, CoreDNS fixed many issues that KubeDNS had introduced due to its rigid design (e.g. Verified Pod Records). CoreDNS’ architecture allows you to add or remove functionalities using plugins. Currently, CoreDNS has over thirty plugins and over twenty external plugins. By chaining plugins, you can enable monitoring with Prometheus, tracing with Jaeger, logging with Fluentd, configuration with K8s’ API or etcd, as well as enable advanced DNS features and integrations.
Linkerd (Incubator) – Linkerd is an open source network proxy designed to be deployed as a service mesh, which is a dedicated layer for managing, controlling, and monitoring service-to-service communication within an application. Linkerd helps developers run microservices at scale by improving an application’s fault tolerance via the programmable configuration of circuit breaking, rate limiting, timeouts and retries without application code changes. It also provides visibility into microservices via distributed tracing with Zipkin. Finally, it provides advanced traffic control instrumentation to enable canary, staging, and blue-green deployments. SecOps teams will appreciate the capability of Linkerd to transparently encrypt all cross-node communication in a Kubernetes cluster via TLS. Linkerd is built on top of Twitter’s Finagle project, which has extensive production usage and attracts the interest of many companies exploring Service Meshes. Today Linkerd can be used with Kubernetes, DC/OS and AWS/ECS. The Linkerd service mesh is deployed on Kubernetes as a DaemonSet, meaning it runs one Linkerd pod on each node of the cluster.
Envoy (Incubator) – Envoy is a modern edge and service proxy designed for cloud native applications. It is a vendor agnostic, high performance, lightweight (written in C++) production grade proxy that was developed and battle tested at Lyft. Envoy is now a CNCF incubating project. Envoy provides fault tolerance capabilities for microservices (timeouts, security, retries, circuit breaking) without having to change any lines of existing application code. It provides automatic visibility into what’s happening between microservices via integration with Prometheus, Fluentd, Jaeger and Kiali. Envoy can also be used as an edge proxy (e.g. L7 Ingress Controller for Kubernetes) due to its capabilities for traffic routing and splitting as well as zone-aware load balancing with failovers.
While the service proxy landscape already has many options, Envoy is a great addition that has sparked a lot of interest and revolutionary ideas around service meshes and modern load-balancing. Heptio announced project Contour, an Ingress controller for Kubernetes that works by deploying the Envoy proxy as a reverse proxy and load balancer. Contour supports dynamic configuration updates and multi-team Kubernetes clusters with the ability to limit the Namespaces that may configure virtual hosts and TLS credentials as well as provide advanced load balancing strategies. Another project that uses Envoy at its core is Datawire’s Ambassador – a powerful Kubernetes-native API Gateway. Since Envoy was written in C++, it is lightweight and a strong candidate to run in a sidecar pattern inside Kubernetes and, in combination with its API-driven config update style, has become a natural choice for service mesh dataplanes. First, the service mesh Istio announced Envoy to be the default service proxy for its dataplane, where Envoy proxies are deployed alongside each instance inside Kubernetes using a sidecar pattern. This creates a transparent service mesh that is controlled and configured by Istio’s Control Plane. In contrast to the DaemonSet pattern used in Linkerd v1, this approach provides visibility into each service as well as the ability to create secure TLS connections for each service inside Kubernetes or even in Hybrid Cloud scenarios. Recently HashiCorp announced that its open source project Consul Connect will use Envoy to establish secure TLS connections between microservices.
Today Envoy has a large and active open source community that is not driven by any single vendor or commercial project. If you want to start using Envoy, try Istio, Ambassador or Contour or join the Envoy community at KubeCon (Seattle, WA) on December 10th 2018 for the very first EnvoyCon.
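Plugin chaining can be pictured as a series of handlers, each of which either answers a query or passes it to the next one. The sketch below illustrates that pattern in Python with hypothetical plugins; CoreDNS itself is written in Go and its real plugin interface differs.

```python
from typing import Callable, Dict, List, Optional

Handler = Callable[[str], Optional[str]]
Plugin = Callable[[Handler], Handler]


def log_plugin(seen: List[str]) -> Plugin:
    """Record every query name, then pass the query on."""
    def wrap(next_handler: Handler) -> Handler:
        def handle(name: str) -> Optional[str]:
            seen.append(name)
            return next_handler(name)
        return handle
    return wrap


def hosts_plugin(records: Dict[str, str]) -> Plugin:
    """Answer from a static table; fall through to the next plugin on a miss."""
    def wrap(next_handler: Handler) -> Handler:
        def handle(name: str) -> Optional[str]:
            return records.get(name) or next_handler(name)
        return handle
    return wrap


def build_chain(plugins: List[Plugin]) -> Handler:
    """Compose plugins so the first in the list sees each query first."""
    handler: Handler = lambda name: None  # end of chain: no answer
    for plugin in reversed(plugins):
        handler = plugin(handler)
    return handler
```

Adding monitoring or tracing then amounts to placing another wrapper in the list, without touching the plugins that actually resolve names.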
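Circuit breaking, one of the fault tolerance features mentioned above, can be sketched as a small state machine. The class below is a generic illustration with hypothetical names and thresholds, not Linkerd's implementation; a proxy applies the same idea to network calls so that application code does not have to.

```python
import time


class CircuitBreaker:
    """Open after max_failures consecutive failures; allow a trial call
    once reset_after seconds have passed."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, func):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: call rejected")
            self.opened_at = None  # half-open: let one trial call through
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # a success closes the circuit again
        return result
```

While the circuit is open, callers fail fast instead of piling more requests onto a service that is already struggling.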
Falco (Sandbox) – Falco is an open source runtime security tool developed by Sysdig. It was designed to detect anomalous activity and intrusions in Kubernetes-orchestrated systems. Falco is more an auditing tool than an enforcement tool (such as seccomp or AppArmor). It is run in user space with the help of a Sysdig kernel module that retrieves system calls.
Falco is run inside Kubernetes as a DaemonSet with a preconfigured set of rules that define the behaviours and events to watch out for. Based on those rules, Falco detects and raises alerts on any matching behaviour that makes Linux system calls (such as shells run inside containers or binaries making outbound network connections). These events can be captured at STDERR via Fluentd and then sent to Elasticsearch for filtering or to Slack. This can help organizations quickly respond to security incidents, such as container exploits and breaches, and minimize the financial penalties posed by such incidents.
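Rule-based detection of this kind boils down to evaluating a set of conditions against each observed event. The sketch below uses hypothetical rules and event fields loosely inspired by the examples in the text; Falco's real rules are written in its own YAML-based rule language.

```python
from typing import Callable, Dict, List, NamedTuple

Event = Dict[str, str]


class Rule(NamedTuple):
    name: str
    condition: Callable[[Event], bool]


# Hypothetical rules and event fields, for illustration only.
RULES = [
    Rule("shell in container",
         lambda e: e.get("syscall") == "execve"
         and e.get("container") != "host"
         and e.get("process") in {"sh", "bash"}),
    Rule("unexpected outbound connection",
         lambda e: e.get("syscall") == "connect"
         and e.get("process") not in {"curl", "envoy"}),
]


def alerts_for(event: Event) -> List[str]:
    """Return the names of all rules the event matches."""
    return [rule.name for rule in RULES if rule.condition(event)]
```

Each alert name returned here corresponds to a line that a real deployment would write to its output, ready to be shipped onward by a log collector.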
With the addition of Falco to the CNCF sandbox, we hope that there will be closer integrations with other CNCF projects in the future. To start using Falco, find an official Helm Chart.
Spiffe (Sandbox) – Spiffe provides a secure production identity framework. It enables communication between workloads by verifying identities. It’s policy-driven, API-driven, and can be entirely automated. It’s a cloud native solution to the complex problem of establishing trust between workloads, which becomes difficult and even dangerous as workloads scale elastically and become dynamically scheduled. Spiffe is a relatively new project, but it was designed to integrate closely with Spire.
Spire (Sandbox) – Spire is Spiffe’s runtime environment. It’s a set of software components that can be integrated into cloud providers and middleware layers. Spire has a modular architecture that supports a wide variety of platforms. In particular, the communities around Spiffe and Spire are growing very quickly. HashiCorp just announced support for Spiffe IDs in Vault, so it can be used for key material and rotation. Spiffe and Spire are both currently in the sandbox.
Tuf (Incubator) – Tuf is short for ‘The Update Framework’. It is a framework that is used for trusted content distribution. Tuf helps solve content trust problems, which can be a major security problem. It helps validate the provenance of software and verify that only the latest version is being used. The TUF project plays an important role within the Notary project that is described below. It is also used in production by many companies, including Docker, DigitalOcean, Flynn, Cloudflare, and VMware, to build their internal tooling and products.
Notary (Incubator) – Notary is a secure software distribution implementation. In essence, Notary is based on TUF and ensures that every pulled Docker image is a signed, correct, and untampered version of that image at any stage of your CI/CD workflow, which is one of the major security concerns for Docker-based deployments in Kubernetes systems. Notary publishes and manages trusted collections of content. It allows DevOps engineers to approve trusted data that has been published and create signed collections. This is similar to the software repository management tools present in modern Linux systems, but for Docker images. Some of Notary’s goals include guaranteeing image freshness (always having up-to-date content so vulnerabilities are avoided), trust delegation between users, and trusted distribution over untrusted mirrors or transport channels. While Tuf and Notary are generally not used directly by end users, their solutions integrate into various commercial products or open source projects for content signing or image signing of trusted distributions, such as Harbor, Docker Enterprise Registry, Quay Enterprise, and Aqua. Another interesting open-source project in this space is Grafeas, an open source API for metadata that can be used to store “attestations” or image signatures, which can then be checked as part of admission control. It is used in products such as the Container Analysis API and Binary Authorization at GCP, as well as products from JFrog and AquaSec.
Open Policy Agent (Sandbox) – By requiring policies to be specified declaratively, Open Policy Agent (OPA) allows different kinds of policies to be distributed across a technology stack and have updates enforced automatically without being recompiled or redeployed. Living at the application and platform layers, OPA works by answering queries that services send to it to inform policy decisions. It integrates well with Docker, Kubernetes, Istio, and many more.
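Two of the guarantees mentioned above, integrity and freshness, can be illustrated with a short check. The sketch assumes a hypothetical metadata record whose signature has already been verified; signature verification itself, which is central to TUF and Notary, is deliberately left out.

```python
import hashlib
from datetime import datetime, timezone


def verify_target(content: bytes, metadata: dict, now: datetime) -> bool:
    """Accept content only if the metadata is unexpired and the digest matches.

    metadata is a hypothetical, already signature-checked record with
    "sha256" (hex digest) and "expires" (timezone-aware datetime) fields.
    """
    if now >= metadata["expires"]:
        return False  # stale metadata cannot guarantee freshness
    return hashlib.sha256(content).hexdigest() == metadata["sha256"]
```

Rejecting expired metadata is what prevents an attacker from replaying an old, validly signed description of a vulnerable image.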
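The benefit of declarative policy is that the rules are data that can be replaced without touching the service that asks for a decision. The sketch below illustrates this with a hypothetical rule schema; OPA's actual policies are written in its own language, Rego.

```python
from typing import Any, Dict, List

# A policy expressed as data rather than code: each rule lists the attribute
# values a request must have to be allowed. The schema is hypothetical.
POLICY: List[Dict[str, Any]] = [
    {"role": "admin"},
    {"role": "developer", "method": "GET"},
]


def allow(request: Dict[str, Any], policy: List[Dict[str, Any]] = POLICY) -> bool:
    """Allow the request if every attribute of at least one rule matches it."""
    return any(
        all(request.get(key) == value for key, value in rule.items())
        for rule in policy
    )
```

Passing a different `policy` list changes the decision for the same request, with no change to `allow` or to the calling service.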
Streaming and Messaging
NATS (Incubator) – NATS is a messaging service that focuses on middleware, allowing infrastructures to send and receive messages between distributed systems. Its clustering and auto-healing technologies are HA, and its log-based streaming has guaranteed delivery for replaying historical data and receiving all messages. NATS has a relatively straightforward API and supports a diversity of technical use cases, including messaging in the cloud (general messaging, microservices transport, control planes, and service discovery), and IoT messaging. Unlike the solutions for logging, monitoring, and tracing listed above, NATS works at the application layer.
gRPC (Incubator) – A high-performance RPC framework, gRPC allows communication between libraries, clients and servers in multiple platforms. It can run in any environment and provide support for proxies, such as Envoy and Nginx. gRPC efficiently connects services with pluggable support for load balancing, tracing, health checking, and authentication. Connecting devices, applications, and browsers with back-end services, gRPC is an application level tool that facilitates messaging.
CloudEvents (Sandbox) – CloudEvents provides developers with a common way to describe events that happen across multi-cloud environments. By providing a specification for describing event data, CloudEvents simplifies event declaration and delivery across services and platforms. Still in Sandbox phase, CloudEvents should greatly increase the portability and productivity of an application.
The cloud native ecosystem is continuing to grow at a fast pace. More projects will be adopted into the Sandbox in the near future, giving them a chance to gain community interest and awareness. That said, we hope that infrastructure-related projects like Vitess, NATS, and Rook will continuously get attention and support from CNCF as they will be important enablers of Cloud Native deployments on prem. Another area that we hope the CNCF will continue to place focus on is Cloud Native Continuous Delivery, where there is currently a gap in the ecosystem.
While the CNCF accepts and graduates new projects, it is also important to have a working mechanism for removing projects that have lost community interest because they cease to have value or are replaced by other, more relevant projects. While the project submission process is open to anybody, I hope that the TOC will continue to only sponsor the best candidates, making the CNCF a diverse ecosystem of projects that work well with each other.
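A common event description is essentially a small envelope of agreed-upon attributes around the payload. The sketch below uses the required attribute names from the later 1.0 version of the specification (`specversion`, `id`, `source`, `type`), which may differ from the draft that was current when this was written; the helper functions are hypothetical.

```python
import uuid

# Required context attributes in CloudEvents 1.0 (an assumption relative to
# the Sandbox-era drafts, which used different names).
REQUIRED = ("specversion", "id", "source", "type")


def make_event(source: str, event_type: str, data: dict) -> dict:
    """Wrap a payload in a CloudEvents-style envelope."""
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": source,
        "type": event_type,
        "datacontenttype": "application/json",
        "data": data,
    }


def is_valid(event: dict) -> bool:
    """Check that every required attribute is a non-empty string."""
    return all(isinstance(event.get(k), str) and event[k] for k in REQUIRED)
```

Because producers and consumers agree on the envelope, a consumer can route on `type` and `source` without knowing anything about the platform that emitted the event.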
As a CNCF ambassador, I hope to teach people how to use these technologies. At CloudOps I lead workshops on Docker and Kubernetes that provide an introduction to cloud native technologies and help DevOps teams operate their applications. I also organize Kubernetes and Cloud Native meetups that bring in speakers from around the world and represent a variety of projects. They are run quarterly in Montreal, Ottawa, Toronto, Kitchener-Waterloo, and Quebec City. I would also encourage people to join the Ambassador team at CloudNativeCon North America 2018 on December 10th. Reach out to me @archyufa or email CloudOps to learn more about becoming cloud native.
Developing and deploying applications that communicate in distributed systems, especially in cloud computing, is complex. Messaging has evolved to address the general needs of distributed applications but hasn’t gone far enough. We need a messaging system that takes the next steps to address cloud, edge, and IoT needs. These include ever-increasing scalability requirements in terms of millions, if not billions of endpoints, a new emphasis toward resiliency of the system as a whole over individual components, end-to-end security, and the ability to have a zero-trust system. In this post we’ll discuss the steps NATS is taking to address these needs, leading toward a securely connected world.
Let’s break down the challenges into scalability, resiliency at scale, and security.
To support millions, or even billions of endpoints spanning the globe, most architectures would involve a federated approach with many layers filtering up to a control layer, driven by a required central authority for configuration and security. Instead, NATS is taking a distributed and decentralized approach.
We are introducing a resilient, self-healing way to connect NATS clusters together – NATS superclusters (think clusters of clusters) that optimize data flow. If one NATS server cluster can handle millions of clients, imagine hundreds of them connected to support truly global connectivity with no need for a central authority or hub – no single point of failure.
Traditional messaging systems require precise knowledge of a server cluster topology. This was acceptable in the past as there was no requirement otherwise. However, this severely hinders scalability in a hybrid cloud environment, and beyond to edge and IoT. NATS addresses this with the cloud-native feature of auto-discovery. NATS servers share topology by automatically exchanging cluster information with each other. When scaling upwards and adding a NATS server to an existing cluster, topology information is automatically shared, and boom – each server has complete knowledge of the cluster in real-time with zero configuration changes. And it gets better – clients also receive this topology information. NATS maintainer-supported clients use this to keep an up-to-date list of available servers, enabling clients to connect to new servers and avoid connecting to servers that are no longer available – again with no configuration changes. View this feature in the context of scaling up (or down), rolling upgrades, or cruising past instance failures and this is an operator’s dream.
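On the client side, auto-discovery amounts to keeping a server list that is refreshed from what the cluster announces. The `ServerPool` class below is a hypothetical sketch of that bookkeeping and is not taken from any NATS client library.

```python
class ServerPool:
    """Client-side list of known servers, refreshed from server announcements."""

    def __init__(self, seed_urls):
        self.urls = list(seed_urls)

    def update(self, advertised_urls):
        """Adopt the cluster's current view, keeping already-known servers first."""
        current = set(advertised_urls)
        kept = [u for u in self.urls if u in current]
        added = [u for u in advertised_urls if u not in kept]
        self.urls = kept + added

    def next_server(self, failed_url=None):
        """Pick the first known server that is not the one that just failed."""
        for url in self.urls:
            if url != failed_url:
                return url
        return None
```

A client configured with a single seed URL can therefore learn about servers added later and stop trying servers that have left the cluster, without any configuration change.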
Resiliency at Scale
In messaging, priorities have changed with the move from traditional on-premises computing to cloud computing: the needs of the system as a whole must be prioritized over individual components in the system. Messaging systems evolved to suit predictable and well-defined systems. We knew exactly what and where our hardware resources were, so we could accurately identify, predict, and shore up weak points of the system. While cloud vendors have done a great job, you cannot count on the same levels of predictability unless you are willing to pay for dedicated hardware.
Today we have machine instances disappearing (often intentionally), spurious network issues, or even unpredictable availability of CPU cycles on multi-tenant instances. Add to this the decomposition of applications into microservices and we have many more moving parts in a distributed system – and many more places to fail.
Unhealthy servers or clients that can’t keep up are known to NATS as slow consumers. The NATS messaging system will not assist slow consumers. Instead, they are disconnected, protecting the system as a whole. Server clusters self-heal, and clients might be restarted or simply reconnect elsewhere. If using NATS streaming, missed messages are redelivered.
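The slow consumer policy can be sketched as a bounded buffer per subscriber: when the buffer is full, the subscriber is cut off instead of slowing down everyone else. The classes and limits below are hypothetical and only illustrate the policy, not the NATS server's implementation.

```python
from collections import deque


class Subscriber:
    """A client connection with a bounded buffer of undelivered messages."""

    def __init__(self, name: str, max_pending: int):
        self.name = name
        self.max_pending = max_pending
        self.pending = deque()
        self.connected = True


def publish(subscribers, message):
    """Deliver to every connected subscriber; disconnect any whose buffer is full.

    Returns the names of the subscribers disconnected by this publish.
    """
    dropped = []
    for sub in subscribers:
        if not sub.connected:
            continue
        if len(sub.pending) >= sub.max_pending:
            # Protect the system as a whole rather than wait for the slow consumer.
            sub.connected = False
            sub.pending.clear()
            dropped.append(sub.name)
        else:
            sub.pending.append(message)
    return dropped
```

The publisher and the healthy subscribers never block on the slow one, which is the trade-off the paragraph above describes.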
This works so well that we have users who deploy on spot instances knowing the instances will terminate at some point, counting on the holistically resilient behavior of NATS to reduce their operational costs.
The NATS team aims to enable the creation of a securely connected world where data transfer is protected end to end and metadata is anonymous. NATS is taking a novel approach to solving this.
Concerning identities, we’ll continue to support user/password authentication, but our preference will be a new approach using NKeys – a form of an Ed25519 key made extremely simple.
NKeys are fast and resistant to side channel attacks. Public NKeys may be registered with the NATS server and during connect, the endpoint (client application) signs a nonce with its private key and returns it to the server where the identity can be verified. The NATS messaging system will never have access to or store private keys. Operators or organizations manage their own private keys. In this new model, authorization to publish and receive data is tied to an NKey identity.
NKeys may be assigned to accounts, which are securely isolated communication contexts that allow multi-tenancy. Servers may be configured with multiple accounts, to support true multi-tenancy, bifurcating topology from the data flow. The decision to silo data is driven by use case, rather than software limitations, and operators only need to maintain one NATS deployment. An account can span several NATS server clusters (enterprise multi-tenancy), be limited to just one cluster (data silo by cluster), exist within a subset of a cluster or even in a single server.
Between accounts, specific data can be shared or exchanged through a service (request/reply) or a stream (publish/subscribe). Mutual agreement between accounts allows data flow, allowing for decentralized account management.
Of course, NATS will continue to support TLS connections, but applications can go a step further by using NKeys to sign payloads.
Connecting It All Together
These features, along with other work the NATS team is doing on scalability, resiliency at scale, and security, will allow NATS to provide secure and decentralized global connectivity. We’ll be discussing these in-depth at KubeCon + CloudNativeCon North America in December, and have three sessions – we’d love it if you can attend. We always enjoy a good conversation around solving hard problems like this and invite you to stop by to visit the NATS maintainers at the Synadia booth to learn more.