

TOC Votes to Move Dragonfly into CNCF Incubator


Today, the CNCF Technical Oversight Committee (TOC) voted to accept Dragonfly as an incubation-level hosted project.

Dragonfly, which was accepted into the CNCF Sandbox in October 2018, is an open source, cloud native image and file distribution system. Dragonfly was created in June 2015 by Alibaba Cloud to improve the user experience of image and file distribution in Kubernetes. This allows engineers in enterprises to focus on the application itself rather than infrastructure management.

“Dragonfly is one of the backbone technologies for container platforms within Alibaba’s ecosystem, supporting billions of application deliveries each year, and in use by many enterprise customers around the world,” said Li Yi, senior staff engineer, Alibaba. “Alibaba looks forward to continually improving Dragonfly, making it more efficient and easier to use.”

The goal of Dragonfly is to tackle distribution problems in cloud native scenarios. The project comprises three main components: the supernode, which acts as the central scheduler and controls the distribution procedure across the peer network; dfget, which resides on each peer as an agent that downloads file pieces; and dfdaemon, a proxy that intercepts image download requests from the container engine and hands them to dfget.

“Dragonfly improves the user experience by taking advantage of a P2P image and file distribution protocol and easing the network load of the image registry,” said Sheng Liang, TOC member and project sponsor. “As organizations across the world migrate their workloads onto container stacks, we expect the adoption of Dragonfly to continue to increase significantly.”

Dragonfly integrates with other CNCF projects, including Prometheus, containerd, Harbor, Kubernetes, and Helm. Project maintainers come from Alibaba, ByteDance, eBay, and Meitu, and there are more than 20 contributing companies, including NetEase, JD.com, Walmart, VMware, Shopee, China Mobile, Qunar, ZTE, Qiniu, NVIDIA, and others.

Main Dragonfly Features:

  • P2P-based file distribution: Using P2P technology for file transmission makes full use of the bandwidth of each peer to improve download efficiency, and it saves significant cross-IDC bandwidth, especially costly cross-border bandwidth.
  • Non-invasive support for all kinds of container technologies: Dragonfly can seamlessly support various containers for distributing images.
  • Host-level speed limit: Many downloading tools (wget/curl) can only limit the rate of the current download task; Dragonfly also provides a rate limit for the entire host.
  • Passive CDN: The CDN mechanism can avoid repetitive remote downloads.

Notable Milestones:

  • 7 project maintainers from 4 organizations
  • 67 contributors
  • 21 contributing organizations 
  • 4.6k+ GitHub stars
  • 100k+ downloads on Docker Hub
  • 120% increase in commits last year

Since it joined the CNCF Sandbox, Dragonfly has grown rapidly across industries including e-commerce, telecom, finance, the internet, and more. Users include organizations like Alibaba, China Mobile, Ant Financial, Huya, Didi, iFLYTEK, and others.

“As cloud native adoption continues to grow, the distribution of container images in large scale production environments becomes an important challenge to tackle, and we are glad that Dragonfly shares some of those initial lessons learned at Alibaba,” said Chris Aniszczyk, CTO/COO of CNCF. “The Dragonfly project has made a lot of strides recently as it was completely rewritten in Golang for performance improvements, and we look forward to cultivating and diversifying the project community.”

In its latest version, Dragonfly 1.0.0, the project has been completely rewritten in Golang to improve ease of use with other cloud native technologies. Now Dragonfly brings a more flexible and scalable architecture, more cloud scenarios, and a potential integration with OCI (Open Container Initiative) to make image distribution more efficient.

“We are very excited for Dragonfly to move into incubation,” said Allen Sun, staff engineer at Alibaba and Dragonfly project maintainer. “The maintainers have been working diligently to improve on all aspects of the project, and we look forward to seeing what this next chapter will bring.”

As a CNCF hosted project, joining incubating technologies like OpenTracing, gRPC, CNI, Notary, NATS, Linkerd, Helm, Rook, Harbor, etcd, OPA, CRI-O, TiKV, CloudEvents, Falco, and Argo, Dragonfly is part of a neutral foundation aligned with its technical interests, as well as the larger Linux Foundation, which provides governance, marketing support, and community outreach.

Every CNCF project has an associated maturity level: sandbox, incubating, or graduated. For more information on maturity requirements for each level, please visit the CNCF Graduation Criteria v.1.3.

To learn more about Dragonfly, visit https://github.com/dragonflyoss/Dragonfly.

Serverless Open-Source Frameworks: OpenFaaS, Knative, & More


Member Blog Post

Originally published on the Epsagon blog by Ran Ribenzaft, co-founder and CTO at Epsagon

This article discusses several popular open-source serverless frameworks and goes deep into OpenFaaS and Knative to present their architecture, main components, and basic installation steps. If you are interested in this topic and plan to develop serverless applications using open-source platforms, this article will give you a better understanding of these solutions.

Over the past few years, serverless architectures have been rapidly gaining in popularity. The main advantage of this technology is the ability to create and run applications without the need for infrastructure management. In other words, when using a serverless architecture, developers no longer need to allocate resources, scale and maintain servers to run applications, or manage databases and storage systems. Their sole responsibility is to write high-quality code.

There have been many open-source projects for building serverless frameworks (Apache OpenWhisk, IronFunctions, Fn from Oracle, OpenFaaS, Kubeless, Knative, Project Riff, etc.). Moreover, because open-source platforms provide access to IT innovations, many developers are interested in open-source solutions.

OpenWhisk, Firecracker & Oracle FN

Before delving into OpenFaaS and Knative, let’s briefly describe some of the other platforms.

Apache OpenWhisk is an open cloud platform for serverless computing that uses cloud computing resources as services. Compared to other open-source projects (Fission, Kubeless, IronFunctions), Apache OpenWhisk is characterized by a large codebase, high-quality features, and a large number of contributors. However, the heavyweight tooling this platform requires (CouchDB, Kafka, Nginx, Redis, and ZooKeeper) causes difficulties for developers. In addition, this platform is imperfect in terms of security.

Firecracker is a virtualization technology introduced by Amazon. It provides virtual machines with minimal overhead and allows for the creation and management of isolated environments and services. Firecracker offers lightweight virtual machines called microVMs, which use hardware-based virtualization for full isolation while providing the performance and flexibility of conventional containers. One inconvenience for some developers is that the technology is written in Rust. Firecracker also uses a trimmed-down software environment with a minimal set of components: to save memory, reduce startup time, and increase security, it boots a modified Linux kernel from which everything superfluous has been excluded, and device support is reduced accordingly. The project was developed at Amazon Web Services to improve the performance and efficiency of the AWS Lambda and AWS Fargate platforms.

Oracle Fn is an open-source serverless platform that provides an additional level of abstraction for cloud systems to allow for Functions as a Service (FaaS). As in other open platforms, in Oracle Fn the developer implements the logic at the level of individual functions. Unlike existing commercial FaaS platforms, such as Amazon AWS Lambda, Google Cloud Functions, and Microsoft Azure Functions, Oracle’s solution is positioned as having no vendor lock-in. The user can choose any cloud solution provider to launch the Fn infrastructure, combine different cloud systems, or run the platform on their own equipment.

Kubeless is an infrastructure that supports the deployment of serverless functions in your cluster and lets you execute both HTTP and event triggers in your Python, Node.js, or Ruby code. Kubeless is built on Kubernetes’ core functionality, such as Deployments, Services, ConfigMaps, and so on. This keeps the Kubeless code base small and also means that developers do not have to reimplement large portions of scheduling logic that already exists inside Kubernetes itself.
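To see how thin that layer is, a Kubeless function is declared as a single custom resource. The manifest below is a sketch based on the kubeless.io/v1beta1 Function CRD; the runtime name and field details may differ between Kubeless releases, so treat it as illustrative rather than authoritative:

```
# Illustrative Kubeless Function resource: the controller turns the
# inline handler code into a Deployment and Service in the cluster.
apiVersion: kubeless.io/v1beta1
kind: Function
metadata:
  name: hello
spec:
  runtime: python3.7        # runtime assumed to be installed with Kubeless
  handler: handler.hello    # <module>.<function> to invoke
  function: |
    def hello(event, context):
        return "Hello, Kubeless!"
```

Because the function is just another Kubernetes object, it can be listed, described, and deleted with kubectl like any built-in resource.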

Fission is an open-source platform that provides a serverless architecture over Kubernetes. One of the advantages of Fission is that it takes care of most of the tasks of automatically scaling resources in Kubernetes, freeing you from manual resource management. The second advantage of Fission is that you are not tied to one provider and can move freely from one to another, provided that they support Kubernetes clusters (and any other specific requirements that your application may have).

Main Benefits of Using OpenFaaS and Knative

OpenFaaS and Knative are free, publicly available, open-source environments for creating and hosting serverless functions. These platforms allow you to:

  • Reduce idle resources.
  • Quickly process data.
  • Interconnect with other services.
  • Balance load with intensive processing of a large number of requests.

However, despite the advantages of both platforms and serverless computing in general, developers must assess the application’s logic before starting an implementation. This means that you must first break the logic down into separate tasks, and only then can you write any code.

For clarity, let’s consider each of these open-source serverless solutions separately.

How to Build and Deploy Serverless Functions With OpenFaaS

The main goal of OpenFaaS is to simplify building serverless functions with Docker containers, allowing you to run complex and flexible infrastructures.

OpenFaaS Design & Architecture

The OpenFaaS architecture is based on a cloud-native standard and includes the following components: the API Gateway, the Function Watchdog, Prometheus, and a container orchestrator (Kubernetes or Docker Swarm) running on Docker. According to the architecture shown below, when a developer works with OpenFaaS, the process begins with the installation of Docker and ends with the API Gateway.

OpenFaaS Architecture Diagram

OpenFaaS Components and Process

API Gateway

The API Gateway provides a route to all deployed functions and collects cloud native metrics through Prometheus.

Diagram: API Gateway routing from clients to functions in a Docker cluster

Function Watchdog

A Watchdog component is integrated into each function container to support a serverless application, providing a common interface between the user and the function.

OpenFaaS Watchdog Interface

One of the main tasks of the Watchdog is to marshal the HTTP request received from the API Gateway and invoke the selected application.

Prometheus

This component lets you see how metrics change over time, compare them with others, transform them, and view them in text format or as a graph, all without leaving the main page of the web interface. Prometheus stores the collected metrics in RAM and saves them to disk upon reaching a given size limit or after a certain period of time.

Docker Swarm and Kubernetes

Docker Swarm and Kubernetes are the orchestration engines. Components such as the API Gateway, the Function Watchdog, and an instance of Prometheus run on top of these orchestrators. Kubernetes is recommended for production deployments, while Docker Swarm is better suited for local development.

Moreover, all developed functions, microservices, and products are stored in Docker containers; Docker serves as the main platform on which OpenFaaS developers and sysadmins develop, deploy, and run serverless applications.

The Main Points for Installation of OpenFaaS on Docker

The OpenFaaS API Gateway relies on the built-in functions provided by the selected Docker orchestrator. To do this, the API Gateway connects to the appropriate plugin for the selected orchestrator, records various function metrics in Prometheus, and scales functions based on alerts received from Prometheus through AlertManager.

For example, say you are working on a machine with Linux OS and want to write one simple function on one node of a Docker cluster using OpenFaaS. To do this, you would just follow the steps below:

  • Install Docker CE 17.05 or the latest version.
  • Run Docker:
$ docker run hello-world
  • Initialize Docker Swarm:
$ docker swarm init
  • Clone OpenFaaS from Github:

$ git clone https://github.com/openfaas/faas && \
  cd faas && \
  ./deploy_stack.sh

  • Log in to the UI portal at http://127.0.0.1:8080.

Docker is now ready for use, and you no longer have to install it when writing further functions.

Preparing CLI OpenFaaS for Building Functions

To develop a function, you need to install the latest version of the faas-cli command-line tool using a script. With brew, this would be $ brew install faas-cli. With curl, you would use $ curl -sL https://cli.get-faas.com/ | sudo sh.

Different Programming Languages With OpenFaaS

To create and deploy a function with OpenFaaS using templates in the CLI, you can write a handler in almost any programming language. For example:

  • Create new function:
$ faas-cli new --lang <language> <function name>
  • Generate stack file and folder:
$ git clone https://github.com/openfaas/faas && \
  cd faas && \
  git checkout 0.6.5 && \
  ./deploy_stack.sh
  • Build the function:
$ faas-cli build -f <<stack file>>
  • Deploy the function:
$ faas-cli deploy -f <<stack file>>
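The stack file referenced by the build and deploy commands is a YAML file that faas-cli new generates for each function. The sketch below assumes a hypothetical function named fib built from the Python template; the field names follow the OpenFaaS stack file format, but verify them against your faas-cli version:

```
# Illustrative stack file (e.g. fib.yml) generated by faas-cli new
provider:
  name: openfaas                  # older CLI versions use "faas" here
  gateway: http://127.0.0.1:8080  # the local API Gateway deployed above
functions:
  fib:
    lang: python          # template used to scaffold the handler
    handler: ./fib        # folder containing handler.py
    image: fib:latest     # Docker image produced by faas-cli build
```

Running faas-cli build -f fib.yml and faas-cli deploy -f fib.yml then operates on every function listed in this file.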

Testing the Function From OpenFaaS UI

You can quickly test the function in several ways from the OpenFaaS user interface, as shown below:

  • Go to OpenFaaS UI:

http://127.0.0.1:8080/ui/

  • Use curl:
$ curl -d "10" http://localhost:8080/function/fib
  • Use the UI

At first glance, everything probably seems quite simple. However, you still have to deal with many nuances. This is especially the case if you have to work with Kubernetes, require many functions, or need to add additional dependencies to the FaaS main code base.

There is an entire community of OpenFaaS developers on GitHub where you can find useful information as well.

Benefits and Disadvantages of OpenFaaS

OpenFaaS simplifies the building of the system. Fixing errors becomes easier, and adding new functionality to the system is much faster than in the case of a monolithic application. In other words, OpenFaaS allows you to run code in any programming language anytime and anywhere.

However, there are drawbacks:

  • Lengthy cold-start time for some programming languages.
  • Container startup time depends on the provider.
  • Limited lifetime of functions, meaning not all systems can be made serverless. (With OpenFaaS, compute containers cannot keep executable application code in memory for long periods. The platform creates and destroys them automatically, so stateful operation is not possible.)

Deploying and Running Functions With Knative

Knative allows you to develop and deploy container-based server applications that you can easily port between cloud providers. Knative is an open-source platform that is just starting to gain popularity but is of great interest to developers today.

Architecture and Components of Knative

The Knative architecture consists of the Building, Eventing, and Serving components.

Knative Architecture

Knative Architecture and Components

Building

The Building component of Knative is responsible for ensuring that container builds in the cluster are launched from source code. This component works on the basis of existing Kubernetes primitives and also extends them.

Eventing

The Eventing component of Knative is responsible for universal subscription, delivery, and event management as well as the creation of communication between loosely coupled architecture components. In addition, this component allows you to scale the load on the server.

Serving

The main objective of the Serving component is to support the deployment of serverless applications and functions, automatic scaling to and from zero, routing and network programming via Istio, and snapshots of deployed code and configurations. Knative uses Kubernetes as the orchestrator, while Istio performs request routing and advanced load balancing.
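To make the Serving component concrete, a minimal Knative Service can be declared with a single manifest. The sketch below follows the serving.knative.dev/v1 API (older releases, such as the v0.7.0 files referenced in this article, used v1alpha1), and the sample image is purely illustrative:

```
# Illustrative Knative Service: Serving creates the route, configuration,
# and revisions for it, and scales the pods to zero when idle.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld
spec:
  template:
    spec:
      containers:
        - image: gcr.io/knative-samples/helloworld-go  # example image
          env:
            - name: TARGET
              value: "World"
```

Applying this one object is enough for Knative to expose an autoscaled, revisioned HTTP endpoint, which is what distinguishes it from deploying a plain Kubernetes Deployment and Service by hand.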

Example of the Simplest Functions With Knative

You can use several methods to create a server application on Knative. Your choice will depend on your skills and experience with various services, including Istio, Gloo, Ambassador, Google Kubernetes Engine, IBM Cloud Kubernetes Service, Microsoft Azure Kubernetes Service, Minikube, and Gardener, and especially with Kubernetes itself.

Simply select the installation file for each of the Knative components. Links to the main installation files for the three required components can be found below:

Serving Component

https://github.com/knative/serving/releases/download/v0.7.0/serving.yaml

https://github.com/knative/serving/releases/download/v0.7.0/monitoring.yaml

Building Component

https://github.com/knative/build/releases/download/v0.7.0/build.yaml

Eventing Component

https://github.com/knative/eventing/releases/download/v0.7.0/release.yaml

https://github.com/knative/eventing/releases/download/v0.7.0/eventing.yaml

Each of these components is characterized by a set of objects. More detailed information about the syntax and installation of these components can be found on Knative’s own development site.

Benefits and Disadvantages of Knative

Knative has a number of benefits. Like OpenFaaS, Knative allows you to create serverless environments using containers. This in turn gives you a local event-based architecture in which there are no restrictions imposed by public cloud services. Knative also lets you automate the container build process, which provides automatic scaling. Because of this, the capacity for serverless functions is based on predefined threshold values and event-processing mechanisms.

In addition, Knative allows you to create applications on-premises, in the cloud, or in a third-party data center. This means that you are not tied to any one cloud provider. And because it is built on Kubernetes and Istio, Knative has a higher adoption rate and greater adoption potential.

One main drawback of Knative is the need to independently manage container infrastructure. Simply put, Knative is not aimed at end users. However, because of this, more commercially managed Knative offerings are becoming available, such as Google Kubernetes Engine and Managed Knative for the IBM Cloud Kubernetes Service.

Conclusion

Despite the growing number of open-source serverless platforms, OpenFaaS and Knative will continue to gain popularity among developers. It is worth noting that these platforms cannot be compared directly because they are designed for different tasks.

Unlike OpenFaaS, Knative is not a full-fledged serverless platform; it is better positioned as a platform for creating, deploying, and managing serverless workloads. However, from the point of view of configuration and maintenance, OpenFaaS is simpler. With OpenFaaS, there is no need to install all components separately as with Knative, and you don’t have to clear previous settings and resources for new developments if the required components have already been installed.

Still, as mentioned above, a significant drawback of OpenFaaS is that the container launch time depends on the provider, while Knative is not tied to any single cloud solution provider. Based on the pros and cons of both, organizations may also choose to use Knative and OpenFaaS together to effectively achieve different goals.

Cloud native ecosystem empowering new open source deep learning framework


Member Post

By Zhipeng Huang, open source community manager, MindSpore, Huawei

Hello World, MindSpore

MindSpore[0] is a new open source deep learning training/inference framework from Huawei that can be used for mobile, edge, and cloud scenarios.

MindSpore is designed to provide a friendly development experience and efficient execution for data scientists and algorithm engineers, native support for the Ascend AI processor, and software-hardware co-optimization.

At the same time, MindSpore, as a global AI open source community, aims to further advance the development and enrichment of the AI software/hardware application ecosystem.

When Cloud Native Meets AI Newcomer

Learning from the best, MindSpore also utilizes the cloud native ecosystem for deployment and management. With the recent Kubeflow 1.0 and Kubernetes 1.18 releases, we can experiment with the latest cloud native computing technology for agile MLOps.

To take advantage of the prowess of Kubeflow and Kubernetes, the first thing we did was to write an operator for MindSpore, i.e. ms-operator, and to define a MindSpore CRD. The current version of ms-operator is based on early versions of the PyTorch Operator [1] and TF Operator [2].

The implementation of ms-operator contains the specification and implementation of the MSJob custom resource definition. We will demonstrate a walkthrough of building a ms-operator image and creating a simple MSJob on Kubernetes with the MindSpore `0.1.0-alpha` image. The whole MindSpore community is still working on implementing distributed training on different backends so that users can create and manage MSJobs like other built-in resources on Kubernetes in the near future.

The MindSpore CRD submitted to the Kubernetes API would look something like this:

```
apiVersion: "kubeflow.org/v1"
kind: "MSJob"
metadata:
  name: "msjob-mnist"
spec:
  backend: "tcp"
  masterPort: "23456"
  replicaSpecs:
    - replicas: 1
      replicaType: MASTER
      template:
        spec:
          containers:
            - image: leonwanghui/mindspore-cpu:0.1.0-alpha
              imagePullPolicy: IfNotPresent
              name: msjob-mnist
              command: ["/bin/bash", "-c", "python /tmp/test/MNIST/main.py"]
              volumeMounts:
                - name: training-result
                  mountPath: /tmp/result
                - name: ms-mnist-local-file
                  mountPath: /tmp/test
          volumes:
            - name: training-result
              emptyDir: {}
            - name: ms-mnist-local-file
              hostPath:
                path: /root/gopath/src/gitee.com/mindspore/ms-operator/examples
            - name: entrypoint
              configMap:
                name: dist-train
                defaultMode: 0755
          restartPolicy: OnFailure
    - replicas: 3
      replicaType: WORKER
      template:
        spec:
          containers:
            - image: leonwanghui/mindspore-cpu:0.1.0-alpha
              imagePullPolicy: IfNotPresent
              name: msjob-mnist
              command: ["/bin/bash", "-c", "python /tmp/test/MNIST/main.py"]
              volumeMounts:
                - name: training-result
                  mountPath: /tmp/result
                - name: ms-mnist-local-file
                  mountPath: /tmp/test
          volumes:
            - name: training-result
              emptyDir: {}
            - name: ms-mnist-local-file
              hostPath:
                path: /root/gopath/src/gitee.com/mindspore/ms-operator/examples
            - name: entrypoint
              configMap:
                name: dist-train
                defaultMode: 0755
          restartPolicy: OnFailure
```

The MSJob is currently designed based on the TF Job and PyTorch Job, and is subject to change in future versions.

“Backend” defines the protocol the MindSpore workers will use to communicate when initializing the worker group. MindSpore supports heterogeneous computing across multiple hardware backends (CPU, GPU, Ascend, etc.); the device_target of MindSpore is Ascend by default.

The MindSpore community is working to collaborate with the Kubeflow community and to make the ms-operator more complete, well-organized, and up-to-date. All these components make it easy for machine learning engineers and data scientists to leverage cloud assets (public or on-premises) for machine learning workloads.

Governance Also Matters

MindSpore not only benefits from the cloud native technology ecosystem; it also modeled its community governance on Kubernetes. MindSpore borrowed concepts like the Technical Steering Committee, Special Interest Groups, and Working Groups to build open and efficient community governance. The first-term TSC members consist of experts from various universities, institutions, and startups. [3]

Looking Forward

MindSpore is also looking forward to enabling users to develop models with Jupyter. In the future, users can use Kubeflow tools like Fairing (the Kubeflow Python SDK) to build containers and create Kubernetes resources to train their MindSpore models.

Once training is completed, users can use KFServing [4] to create and deploy a server for inference, thus completing the machine learning life cycle.

Distributed training is another field MindSpore is focusing on. There are two major distributed training strategies nowadays: one based on parameter servers and the other based on collective communication primitives such as allreduce. MPI Operator[5] is one of the core components of Kubeflow, which makes it easy to run synchronized, allreduce-style distributed training on Kubernetes. MPI Operator provides a CRD for defining a training job on a single CPU/GPU, multiple CPU/GPUs, or multiple nodes. It also implements a custom controller to manage the CRD, create dependent resources, and reconcile the desired states. If MindSpore can leverage MPI Operator together with the high-performance Ascend AI processor, it is possible that MindSpore will bring distributed training to an even higher level.
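As a sketch of what an allreduce-style job looks like with the MPI Operator, an MPIJob declares a launcher replica and a set of workers. The manifest below is illustrative only: the image and command are hypothetical placeholders, and field names should be checked against the mpi-operator version in use:

```
# Illustrative MPIJob: one launcher coordinates an allreduce training
# run across two worker pods.
apiVersion: kubeflow.org/v1
kind: MPIJob
metadata:
  name: allreduce-example
spec:
  slotsPerWorker: 1
  mpiReplicaSpecs:
    Launcher:
      replicas: 1
      template:
        spec:
          containers:
            - name: launcher
              image: example/mpi-training:latest   # placeholder image
              command: ["mpirun", "python", "/train.py"]
    Worker:
      replicas: 2
      template:
        spec:
          containers:
            - name: worker
              image: example/mpi-training:latest   # placeholder image
```

The operator’s controller creates the worker pods, wires up SSH/MPI connectivity between them, and restarts the job according to the reconciled desired state.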

[0] https://gitee.com/mindspore/

[1] https://github.com/kubeflow/pytorch-operator

[2] https://github.com/kubeflow/tf-operator

[3] https://www.mindspore.cn/en/community

[4] https://github.com/kubeflow/kfserving

[5] https://github.com/kubeflow/mpi-operator

*** Zhipeng Huang currently serves as the open source community manager for MindSpore. Zhipeng is a TAC member of LF AI, a TAC and Outreach member of the Confidential Computing Consortium, co-lead of the Kubernetes Policy WG, project lead of the CNCF Security SIG, founder of the OpenStack Cyborg project, and co-chair of the OpenStack Public Cloud WG. Zhipeng also leads a team at Huawei that works on ONNX, Kubeflow, Akraino, and other open source communities.

How Cloud Native Is Enabling Babylon’s Medical AI Innovations


Babylon’s mission is to put accessible and affordable healthcare services in the hands of every person on earth. 

Since its launch in the U.K. in 2013, the startup has facilitated millions of digital consultations around the world, and that’s just the start. “We try to combine different types of technology with the medical expertise that we have in-house to build products that will help patients manage and understand their health, and also help doctors be more efficient at what they do,” says Jérémie Vallée, AI Infrastructure Lead at Babylon. 

A large number of these products leverage machine learning and artificial intelligence, and in 2019, researchers hit a pain point. “We have some servers in-house where our researchers were doing a lot of AI experiments and some training of models, and we came to a point where we didn’t have enough compute in-house to run a particular experiment,” says Vallée. 

Babylon had migrated its user-facing applications to a Kubernetes platform in 2018, “and we had a lot of Kubernetes knowledge thanks to the migration,” he adds. To optimize some of the models that had been created, the team turned to Kubeflow, a toolkit for machine learning on Kubernetes. “We tried to create a Kubernetes core server, we deployed Kubeflow, and we orchestrated the whole experiment, which ended up being a really good success,” he says.

Based on that experience, Vallée’s team was tasked with building a self-service platform to help Babylon’s AI teams become more efficient, and by extension help get products to market faster. “Kubernetes is a great platform for machine learning because it comes with all the scheduling and scalability that you need,” says Vallée. 

The need to keep data in every country in which Babylon operates requires a multi-region, multi-cloud strategy, and some countries might not even have a public cloud provider at all. “We wanted to make this platform portable so that we can run training jobs anywhere,” he says. “Kubernetes offered a base layer that allows you to deploy the platform outside of the cloud provider, and then deploy whatever tooling you need. That was a very good selling point for us.”

Once the team decided to build the Babylon AI Research platform on top of Kubernetes, they referred to the Cloud Native Landscape to build out the stack: Prometheus and Grafana for monitoring; an Istio service mesh to control the network on the training platform and control what access all of the workflows would have; Helm to deploy the stack; and Flux to manage the GitOps part of the pipeline. 

The cloud native AI platform has had a huge impact at Babylon. The first research projects run on the platform mostly involved machine learning and natural language processing. These experiments required a huge amount of compute (1,600 CPUs and 3.2 TB of RAM), which was much more than Babylon had in-house. Plus, access to compute used to take hours, or sometimes even days, depending on how busy the platform team was. “Now, with Kubernetes and the self-service platform that we provide, it’s pretty much instantaneous,” says Vallée.

Another important type of work that’s done on the platform is clinical validation for new applications such as Babylon’s Symptom Checker, which calculates the probability of a disease given the evidence input by the user. “Being in healthcare, we want all of our models to be safe before they’re going to hit production,” says Vallée. Using Argo for GitOps “enabled us to scale the process massively.” 

For more on Babylon’s cloud native journey, read the full case study.

TOC Welcomes Argo into the CNCF Incubator


Project Blog

Today, the CNCF Technical Oversight Committee (TOC) voted to accept Argo as an incubation-level hosted project.

The Argo Project is a set of Kubernetes-native tools for running and managing jobs and applications on Kubernetes. Argo was created in 2017 at Applatix, which was acquired by Intuit in 2018. A few months later, BlackRock contributed Argo Events to the Argo project. Both companies are heavily involved in the development and cultivation of the project and the community.

“Our goal with Argo is to empower organizations to declaratively build and run cloud native applications and workflows on Kubernetes using GitOps,” said Pratik Wadher, VP of product development at Intuit. “We are thrilled the project has been accepted into the CNCF incubator, and we look forward to promoting CNCF’s cloud native mission by fostering collaborative development, and providing closer integration and collaboration with other CNCF projects.”

“Event-based workflows play an integral role in data-driven modeling in BlackRock’s Data Science Platform, enabling investors and users across the firm to access a wealth of financial data using research models,” said Michael Francis, tech fellow and head of platform engineering for BlackRock’s investment technology Aladdin. “We were already using Argo Workflows, and we decided to contribute Argo Events, an event-based dependency manager for Kubernetes, to the Argo project.”

Argo provides an easy way to combine three modes of computing – services, workflows, and event-based – in creating jobs and applications on Kubernetes. All the Argo tools are implemented as controllers and custom resources. They use or integrate with other CNCF projects like gRPC, Prometheus, NATS, Helm, and CloudEvents.

“The Argo Project is well aligned with CNCF’s mission to make cloud native computing ubiquitous, and it will bring a number of benefits to the community,” said Michelle Noorali, TOC member and project sponsor. “The project already has an impressive list of users in production and I am looking forward to seeing what the project will accomplish as it comes under the CNCF umbrella.”

Argo is actively used in production by over 100 organizations, including Adobe, Alibaba Cloud, Datadog, DataStax, Google, GitHub, IBM, NVIDIA, SAP, Tesla, Ticketmaster, and Volvo.

Argo consists of four sub-projects:

  • Argo Workflows – Container native workflow engine for Kubernetes supporting both DAG and step-based workflows.
  • Argo Events – Events-based dependency manager for Kubernetes. 
  • Argo CD – Support for declarative GitOps-based deployment of any Kubernetes resource, including Argo Events, services, and deployments across multiple k8s clusters.
  • Argo Rollouts – Support for declarative progressive delivery strategies such as canary, blue-green, and more general forms of experimentation.
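To give a concrete (if simplified) picture of what a step-based Argo Workflow custom resource looks like, the Python sketch below builds a minimal manifest as a plain dictionary, modeled on the well-known whalesay hello-world example; in practice users write this as YAML and submit it with `argo submit` or `kubectl create`.

```python
def whalesay_workflow(message):
    """Build a minimal step-based Argo Workflow manifest.

    Returns a plain dict mirroring the YAML a user would
    normally submit to the cluster.
    """
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": "hello-"},
        "spec": {
            "entrypoint": "main",
            "templates": [
                {
                    "name": "main",
                    # A single sequential step; DAG-style workflows use
                    # a "dag" field with task dependencies instead.
                    "steps": [[{"name": "say", "template": "say"}]],
                },
                {
                    "name": "say",
                    "container": {
                        "image": "docker/whalesay",
                        "command": ["cowsay"],
                        "args": [message],
                    },
                },
            ],
        },
    }

manifest = whalesay_workflow("hello world")
print(manifest["kind"], manifest["spec"]["entrypoint"])  # → Workflow main
```

The Argo controller watches for resources of this kind and drives the steps to completion, which is what “implemented as controllers and custom resources” means in practice.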

Notable Milestones:

  • 8,300 GitHub stars
  • 2,800 Slack members
  • 425 contributors
  • 4,000+ commits
  • 110 end-users
  • 200+ releases

In joining CNCF, the Argo team will continue to grow the Argo community by focusing on the continuous and progressive delivery of microservice and machine learning applications (MLOps) on Kubernetes.

“Given the team’s work in simplifying the use of Kubernetes and enabling GitOps, Argo fits right in with the CNCF community,” said Chris Aniszczyk, CTO/COO of CNCF. “We are excited to cultivate the community under CNCF and look forward to enabling collaboration and coordination with sister projects such as Flux.”

As a CNCF hosted project, joining incubating technologies like OpenTracing, gRPC, CNI, Notary, NATS, Linkerd, Helm, Rook, Harbor, etcd, OPA, CRI-O, TiKV, CloudEvents, and Falco, Argo is part of a neutral foundation aligned with its technical interests, as well as the larger Linux Foundation, which provides governance, marketing support, and community outreach.

Every CNCF project has an associated maturity level: sandbox, incubating, or graduated. For more information on maturity requirements for each level, please visit the CNCF Graduation Criteria v.1.3.

To learn more about Argo, please visit https://github.com/argoproj.

EnterpriseAI: “Kubernetes Tools Keep Coming”

By | Blog

The Cloud Native Computing Foundation (CNCF) said it has accepted Project Argo as a hosted incubator project, the interim step toward “graduation.” Argo was launched in 2017 by the Bay Area startup Applatix, which had been developing a “DevOps in a box” application. Intuit (NASDAQ: INTU), the financial software specialist, acquired Applatix in 2018. READ MORE

We’re all in this Together: A Wellness Guide from the CNCF Well-Being Working Group

By | Blog

This article was contributed by: Chris Lentricchia, Sara E. Davila, Rin Oliver, Jennifer Lankford, Andrew Randall, Shea Stewart, and Dave McAllister.

The ongoing situation surrounding COVID-19 and social distancing is as much of a mental health issue as it is a viral one. With everything that’s going on, it’s normal to have concerns right now. To that end, members of the CNCF Well-Being Working Group have pulled together some common questions about balancing work with our current situation, as well as some resources to help our community ensure their continued mental and physical well-being. We’ve also included ways that you can help while still practicing social distancing. While this list certainly isn’t exhaustive, we hope that by starting to curate a list of frequently asked questions, resources, and suggestions, we can begin to learn to cope with the mental health aspects of COVID-19 and our new world of social distancing, together.

I’m feeling a little down today, what can I do to help myself?

It’s okay to not feel okay. Notice those feelings, and give them permission to be there. We don’t have to be happy all the time; we’re humans and sometimes humans aren’t happy. Don’t rush, and enjoy the “little” things in life, like good coffee, fresh air, and life itself. Although feeling down is normal, remember that happiness is a choice and that nothing can make us feel bad without our approval. There’s always something to be grateful for. 

 I don’t want to complain because I know I’m privileged. Others have it worse than me.

There’s an important difference between complaining and pointing out that you’re feeling a specific type of way. While we should be careful not to make this event about us, telling someone that you’re feeling a specific way right now is human and normal.

 What if I’m not being productive at work?

Give yourself permission to not be productive right now if that’s what you need. This is a complicated time and a lot of us aren’t mentally engaged. This is also a fantastic time to learn how to get away from the “cult of productivity”, where we push ourselves to become more productive until we burn out. Step back, take a deep breath, and do something that makes you feel good.

What about my boss? I have to show them that I’m doing something, right?

To an extent, yes. You shouldn’t be expected to be as productive as you normally would. Even if you normally WFH, this is still an odd period of time. While taking care of yourself, be mindful that the current situation is affecting everyone differently. Keep in mind that your boss may be experiencing their own stressors.

I’m new to working from home. I see all these pieces of advice from people. Will that work for me?

Maybe, or maybe not. All human beings are different (that’s what makes us amazing). You can try some of the tips you see, but you don’t necessarily have to try them. Something that works for someone else may not necessarily work for you too – and that’s completely okay.

I have someone that is messaging me constantly and panicking; they’re inhibiting my ability to take care of myself and focus! I feel like I’m being selfish if I tune them out.

You can’t take care of others if you don’t take care of yourself first. Give yourself permission to advocate for your own needs. While it may not be possible to protect the world from itself, it is possible to protect yourself from the world. Put an emphasis on making the right choices; by not responding to panic, you may be doing the other person a favor – even if they don’t know it. 

I’m kind of lonely right now

That’s understandable. If you’re in need of social interaction, you could try having digital coffee hours with your friends or co-workers, or you could try reaching out to someone that you haven’t in a long time just to catch up. Now is a great time to let someone else know how much they mean to you, too. The most important thing is to find what makes you happy and do it.

Working from home with my family is stressful

Gone are the days of trying to get everyone into the same room – now everyone is always there! It can be difficult to find enough time to focus on work or yourself with your spouse and/or kids at home. It’s very likely that your family and work schedules will need to adjust to accommodate everyone in the house. Clear communication with your family and your coworkers about when you intend to work and when you intend to focus on family will go a long way toward others respecting your time and space.

I’m concerned about my job status

That’s a completely valid concern. Now is an excellent time to have that conversation with your management.

The news is making me stressed out

If you’re feeling stressed out, it might be a good time to consider stepping away. In the 24-hour news cycle, you can choose to consciously limit consumption – instead of checking headlines constantly throughout the day, catch up on the news once in the morning and again in the evening. Most major news, while interesting, is not actionable. For example, knowing that the Dow Jones is taking a nosedive may be interesting, but it’s information you are likely not able to do anything with. At best the information is neutral; at worst, harmful. To that end, updating yourself once or twice per day is enough to stay informed while still protecting yourself from harm.

This is all coming at an interesting time. I was thinking about [insert big decision here]

This isn’t a great time to make big decisions if you can help it. Provided that you have no deadline for your decision, take advantage of this opportunity to think through any decisions you may be facing without acting on them. If you do have to make a large decision soon, talk it through with trusted friends and family before acting.

Exercise used to be my go-to stress reliever, and now I can’t go to the gym.

Consider using this time to build your home practice or to try something new. Physical practices like yoga, mindfulness, breathwork, or meditation could calm your fears. And if yoga isn’t for you, many global fitness brands have pivoted to providing some sort of virtual class. Check in with your local studio or gym and see what they are offering online. Also check in with your social circle and see if you can create a virtual challenge together to help stay motivated.

Mental Health Resources:

Sanvello (Formerly Pacifica Health) is currently making its premium services free to all users. “All content, coping tools, and peer support—is completely free during the COVID-19 crisis.”

Headspace is a mental health app that can be useful in daily life, as well as during COVID-19. Headspace is offering free support during the COVID-19 situation.

The Daily Stoic is a journal, newsletter, and website that documents consumable lessons from Stoic philosophers.

Tiny Care Bot is a twitter bot that tweets periodic healthy reminders.

Youper is an Emotional Health Assistant that applies Artificial Intelligence to monitor and improve your emotional health. It helps you feel your best with quick conversations based on various psychological techniques personalized to your needs and style.

How You Can Help Yourself and Others:

– Reach out to friends and loved ones to check in on them. Video chat is a great way to maintain or build on those personal connections.

– Stay at home and maintain social distancing.

– Donate blood to organizations in your community, keeping in mind that you may be able to donate directly to your local hospital.

– Listen to others. Be compassionate, mindful, and kind.

– Show gratefulness to others; everyone’s trying to do their part, and it feels good to give and receive words of affirmation.

– Be mindful of the spread of misinformation and panic.

– Avoid spreading any kind of medical information if you are not a medical professional, and be mindful of the mental state of others before sharing information that may be alarming to others. Avoid making irrational commentary to stop the spread of panic.

– Avoid speculation and do not spread unfounded rumors or speak for any organization without proper authorization. To help stop misinformation, you can visit the World Health Organization (WHO) mythbusters page, and Centers for Disease Control (CDC) FAQ page.

– Be healthy. Remember to eat right, get regular exercise, and plenty of sleep. Eating foods or taking supplements to support your immune system may not be a bad idea.

– Take a sick day from work to rest and recharge if you’re feeling worn out.

– Slow down and check in on yourself. Ask yourself how you’re feeling and what you need. Also consider asking yourself what you’re grateful for.


Have more ideas? Help build the conversation by sharing your thoughts and ideas with your communities. Together, we can help one another through these challenging times. Keep focusing on the positives in your life, stay cool and calm, and be willing to listen to others. Be compassionate to everyone, and finally, always be kind.

Introducing A New Tool to Make Finding Your Favorite CNCF Videos Easier

By | Blog

CNCF Staff Blog

CNCF has added a powerful new search and indexing tool for our YouTube channels. VideoLake, by VideoKen, is a unified portal for videos with automated video categorization and integrated deep search. The goal is to make it super easy for you, our community, to discover and explore our extensive and valuable video content. 

Why are we trying this? 

Our events, webinars, and community meetings provide us with a wealth of content, and it’s important that we are able to capture this and share it with you. Our video library is growing quickly, and the content within any single video is in-depth. We hope this will make it easier for you to find the content you are looking for – or didn’t even know you needed!

Where are we trying this?

We are currently testing this product with two playlists: KubeCon + CloudNativeCon EU and KubeCon + CloudNativeCon NA 2019. 

How does it work?

Every video has key phrases that define its content and context. The VideoKen AI Player parses through the entire video and indexes these into a word cloud. Users can then click on each phrase and track its occurrence throughout the video. Using an AI engine, videos are automatically categorized and integrated into a search engine. 
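As a rough illustration of the indexing idea – not VideoKen’s actual implementation – the sketch below maps tracked phrases to the timestamps where they occur in a transcript, which is the kind of index that can drive a word cloud and in-video jump links (the segment data and phrase list here are invented for the example):

```python
from collections import defaultdict

def index_transcript(segments, phrases):
    """Map each phrase to the timestamps (in seconds) where it appears.

    `segments` is a list of (start_time, text) pairs; `phrases`
    is the vocabulary to track (e.g. CNCF project names).
    """
    occurrences = defaultdict(list)
    for start, text in segments:
        lowered = text.lower()
        for phrase in phrases:
            if phrase.lower() in lowered:
                occurrences[phrase].append(start)
    return dict(occurrences)

segments = [
    (12, "Today we will deploy Prometheus on Kubernetes"),
    (95, "Kubernetes operators make this easier"),
    (140, "Grafana reads from Prometheus"),
]
index = index_transcript(segments, ["Kubernetes", "Prometheus"])
print(index["Prometheus"])  # → [12, 140]
```

Clicking a phrase in the word cloud then amounts to jumping to the recorded timestamps for that phrase.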

We have hardcoded the “top topics” as CNCF’s graduated and incubating projects. This will make it easy for viewers to search, discover, and explore the specific CNCF content they are looking for.

Some of the features we are excited for you to try are:

  • In-video Search – this allows you to search and find contextual terms you are looking for, and their occurrences throughout the video.
  • Enterprise Video Discovery – this automatically creates relevant metadata for all videos in the database and uses it to power the discovery of relevant videos across CNCF’s content.

Your feedback is important to us!

Please try out this new tool and let us know what you think! If this is valuable to our viewers, we will expand this to our other playlists. Please send any feedback to me at kmcmahon@linuxfoundation.org.

 

Join us for the Cloud Native Summit Online on April 7!

By | Blog

CNCF Staff Blog 

By Kim McMahon, Director of Marketing at CNCF and Priyanka Sharma, CNCF board member and GitLab Director

With the postponement of KubeCon + CloudNativeCon EU, and many of our other favorite face-to-face industry events, CNCF, GitLab, and Kong are excited to announce the Cloud Native Summit Online as another event to get the community together! 

Cloud native open source projects, SIGs, and working groups are fundamental to many of our jobs. As we adjust to working remotely and maintaining productivity, we are excited to bring together experts from the community to provide insights and support around cloud native technologies and CNCF projects. 

The virtual event will take place on Tuesday, April 7 from 6:00 am – 2:00 pm PT / 15:00 – 23:00 CET! 

Cloud Native Summit Online will be a live, fun, and interactive event. It will kick off with a welcome from CNCF CTO Chris Aniszczyk, followed by a presentation from Mark Coleman, CNCF’s Marketing Committee Chairperson, on the Well-Being Working Group. 

Over the course of the event, we will dive into updates from the project maintainers of all CNCF graduated projects: Kubernetes, Prometheus, Envoy, Jaeger, Fluentd, containerd, CoreDNS, and TUF. We will hear from panels of key SIG and WG contributors and spend time in the hallway track, with breakout sessions ranging from lego chats and all-remote tips to conversations about the information we just imbibed. In this time of self-isolation and quarantine, family members – both human and pet versions – are welcome to join. There’ll be a kids-oriented breakout called juice-box! 

Register now and we will share event details once we have finalized the live streaming and “hallway” engagement platform! 

If you are interested in getting involved as an organizer, please check out the website for more information on event planning, including the ability to help drive content structure. 

We hope to see you soon on April 7th on the interwebs!

 

CNCF projects surpass one billion lines of code: A Q&A with DevStats creator Łukasz Gryglicki

By | Blog

Some of you may not be aware that the CNCF community has access to an incredibly valuable reporting tool – DevStats.

CNCF began developing DevStats in 2017 to provide the Kubernetes community with timely and relevant insights into how Kubernetes was dealing with nearly unprecedented growth. Today it has grown to encompass all CNCF projects and, as it is open source, can be customized for nearly any project or metric. 

Beyond tracking the stats to monitor the health of all our hosted projects, we also use DevStats in compiling our Annual and Project Journey Reports.

In monitoring DevStats, we just came across an incredible milestone – all CNCF projects combined have surpassed one billion lines of code. That’s right, one billion!

To mark this achievement, we sat down with DevStats creator Łukasz Gryglicki to learn more about the tool, its history, and how our community can benefit from it. 

CNCF: What is DevStats?

Łukasz Gryglicki: DevStats is a service that takes data from git and GitHub and turns it into graphs reporting community activity. It’s a CNCF-funded project, as well as a service for all CNCF-supported projects. It organizes and displays project data using Grafana dashboards. We host it on some beefy servers generously donated by Packet.

The way it works is that it downloads several petabytes of data representing every public GitHub action of the last six years, and throws out nearly all of it except for the ~1,400 repositories of CNCF-hosted projects. It processes the data and stores it in a Postgres database, and downloads updated data every hour.
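A much-simplified sketch of that filtering step might look like the following (DevStats itself is written in Go; the event shape here is modeled on the public GH Archive format, and the sample events are invented):

```python
def filter_events(events, tracked_repos):
    """Keep only events for repositories in the tracked set.

    Each event follows the GH Archive shape, carrying a
    `repo.name` like "kubernetes/kubernetes"; everything
    outside the allowlist is discarded.
    """
    tracked = set(tracked_repos)
    return [e for e in events if e.get("repo", {}).get("name") in tracked]

events = [
    {"type": "PushEvent", "repo": {"name": "kubernetes/kubernetes"}},
    {"type": "IssuesEvent", "repo": {"name": "example/unrelated"}},
    {"type": "PullRequestEvent", "repo": {"name": "argoproj/argo-workflows"}},
]
cncf = filter_events(events, ["kubernetes/kubernetes", "argoproj/argo-workflows"])
print(len(cncf))  # → 2
```

The surviving events are what gets processed into the Postgres database on each hourly refresh.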

DevStats is now (as of about 9 months ago) a Kubernetes-native application, and uses many other CNCF projects, including Helm, containerd, CoreDNS, and more. DevStats is a fully open source project. It also uses Linux Foundation projects, including Linux (Ubuntu) and Let’s Encrypt, as well as Red Hat’s Patroni for supporting running Postgres databases on Kubernetes.

DevStats also allows users to track custom metrics, not just PRs, issues, or commits. It has many non-standard dashboards, analyzing things such as bot activity, company affiliation, contributor location, time zone mapping, gender, programming languages, license types, and many more. 

CNCF: How did DevStats come to be?

LG: CNCF executive director Dan Kohn proposed the initial architecture for DevStats and recruited me to implement it. We had previously worked together at a healthcare startup, Spreemo. My first implementation was in Ruby, but when I re-implemented in Go I was able to take advantage of concurrency to get a 20x performance improvement.

We created DevStats in 2017 as a way for the Kubernetes community to monitor developer and community data. It was created for the Kubernetes Steering Committee and SIG-Contributor Experience, who needed a tool that would allow in-depth analysis and understanding of what was happening in the community. They were also looking for a way to control the development of such a fast-growing project, with Kubernetes becoming the second largest open source community behind Linux. They needed a tool that understood their workflow (like bot commands, and Kubernetes-specific repository labels). One of the biggest requirements was to allow the analysis of historical data to show how trends evolve.

We first presented the project at KubeCon + CloudNativeCon EU 2018 in Copenhagen. Then, to support better scaling and greater resource demands, it was moved to Kubernetes. It became an example of how a full Kubernetes application should look, conforming to all best practices, as presented at KubeCon + CloudNativeCon EU 2019 in Barcelona. Now, DevStats covers all CNCF projects, some Linux Foundation projects (like Linux and Zephyr), the GraphQL Foundation, the Continuous Delivery Foundation (CDF), the Core Infrastructure Initiative, and more.

CNCF: One Billion lines of code across CNCF projects is an impressive milestone! How did we get here, and what does that mean?

LG: This is a huge milestone for CNCF. First, it means that both CNCF and its projects are growing at an incredible pace. When you think about the fact that “Google Chrome has 6.7 million lines of code, and the operating system Microsoft Windows 10 reportedly has 50 million,” one billion seems all the more impressive. As projects work their way through sandbox and incubation to graduation, they grow and become hardened for enterprise use. The DevStats dashboard shows the number of lines of code by project. 

CNCF: Anything else about DevStats the community should know?

LG: DevStats is open source – anyone can fork it and deploy their own instance for their own project(s). We regularly add new dashboards via feature requests on the DevStats repository, so if you need a special dashboard for your project, file a feature request and we will review it for you!

CNCF: Any more exciting new features on the horizon?

LG: More recently, we have been iterating on several modified versions of a project status dashboard based on feedback from the TOC and project maintainers.

We are also in the process of creating a RESTful API for DevStats. This means people will be able to write their own tools against a DevStats API server that returns data on request. For example, a tool could query its project’s usage daily, and DevStats would return that data as JSON.
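Once the API ships, a daily usage query might look something like the sketch below. Note that the endpoint URL and parameter names here are purely hypothetical; the real API was still being designed at the time:

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint and parameters, for illustration only;
# the real DevStats API was still under design when this was written.
BASE_URL = "https://devstats.example.org/api/v1/usage"

def build_usage_url(project, days):
    """Compose a query URL for daily usage stats of one project."""
    return BASE_URL + "?" + urlencode({"project": project, "days": days})

def parse_usage(raw_json):
    """Decode the JSON payload the server would return."""
    return json.loads(raw_json)

url = build_usage_url("kubernetes", 7)
print(url)  # → https://devstats.example.org/api/v1/usage?project=kubernetes&days=7
```

A cron job could fetch that URL once a day and feed the decoded JSON into whatever reporting the user wants.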

Łukasz Gryglicki has been a Senior Developer at CNCF since 2017. Before joining CNCF, Łukasz worked remotely for companies based in the US, including Cleverstep, Jamis, and Spreemo Health.

He loves to travel in polar areas, such as polar Norway, Finland, and Russia. From 2011 to 2012, he was a scientist for a Polish polar expedition to the Hornsund fjord on the island of Spitsbergen, in northern Norway. Łukasz graduated from Warsaw University of Technology with a Master of Science in Engineering. He lives in a small town in Poland with his wife and two kids.

 
