Project post by Gaius, Dragonfly Maintainer

Terms and definitions

OCI: The Open Container Initiative is a Linux Foundation project launched by Docker in June 2015 to design open standards for operating-system-level virtualization, most importantly Linux containers.
OCI Artifact: An artifact that follows the OCI image spec.
Image: In this article, an image refers to an OCI Artifact.
Image Distribution: Artifact distribution implemented according to the OCI distribution spec.
ECS: A collection of resources composed of CPU, memory, and cloud disks, each of which logically corresponds to a computing hardware entity in the data center infrastructure.
CR: Volcano Engine's image distribution service.
VKE: Volcano Engine's container service. It integrates the new generation of cloud native technology to provide high-performance, container-centric Kubernetes cluster management and helps users quickly build containerized applications.
VCI: Volcano Engine's serverless, containerized computing service. VCI integrates seamlessly with the container service VKE to provide Kubernetes orchestration capabilities. With VCI, you can focus on building the application itself without buying and managing underlying infrastructure such as cloud servers, and you pay only for the resources that the container actually consumes at run time. VCI also supports startup within seconds, highly concurrent creation, sandboxed container security isolation, and more.
TOS: Volcano Engine's distributed cloud storage service, which is massive, secure, low-cost, easy to use, highly reliable, and highly available.
Private Zone: A private DNS service based on the VPC (Virtual Private Cloud) environment. It allows private domain names to be mapped to IP addresses in one or more custom VPCs.
P2P: Peer-to-peer technology. After a peer in a P2P network downloads data from the server, it can itself serve that data to other peers. When a large number of nodes download at the same time, later downloads can be served by peers rather than by the server, which reduces pressure on the server.
Dragonfly: A file distribution and image acceleration system based on P2P technology, and the standard solution and best practice for image acceleration in cloud native architectures. It is hosted by the Cloud Native Computing Foundation (CNCF) as an incubating project.
Nydus: The Nydus Acceleration Framework implements a content-addressable filesystem that accelerates container image startup through lazy loading. It supports the creation of millions of accelerated image containers daily and is deeply integrated with the Linux kernel's EROFS and fscache, enabling in-kernel support for image acceleration.

Background

The Volcano Engine image repository, CR, uses TOS to store container images. This currently meets the demand for large-scale concurrent image pulling to a certain extent, but the achievable pull concurrency is ultimately limited by the bandwidth and QPS of TOS.

Two scenarios currently arise in large-scale image pulling:

  1. The number of clients keeps increasing and the images keep getting larger, so the bandwidth of TOS will eventually be insufficient.
  2. If the client uses Nydus to convert the image format, the request volume to TOS increases by an order of magnitude, and the QPS limit of the TOS API can no longer meet the demand.

Both the image repository service and the underlying storage eventually run into bandwidth and QPS limits. Relying solely on the bandwidth and QPS that the server can provide is therefore unlikely to meet the demand. P2P needs to be introduced to reduce pressure on the server and to support large-scale concurrent image pulling.
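
The two limits can be made concrete with a back-of-the-envelope calculation. The sketch below is only illustrative: the client count, image size, bandwidth, per-client request count, and QPS limit are hypothetical values chosen for the example, not measurements of TOS or CR.

```python
def min_pull_seconds(clients: int, image_bytes: float, bandwidth_bits_per_s: float) -> float:
    """Lower bound on the time for all clients to pull one image from a
    single source whose egress bandwidth is shared among them."""
    total_bits = clients * image_bytes * 8
    return total_bits / bandwidth_bits_per_s


def min_request_seconds(clients: int, requests_per_client: int, qps_limit: float) -> float:
    """Lower bound on the time for the source to serve every request
    when it handles at most qps_limit requests per second."""
    return clients * requests_per_client / qps_limit


# Hypothetical workload: 500 clients pulling a 3 GB image from a 10 Gbit/s source.
bandwidth_bound = min_pull_seconds(500, 3e9, 10e9)

# Hypothetical lazy-loading workload: each client issues 3000 chunk requests
# against a source limited to 10,000 requests per second.
qps_bound = min_request_seconds(500, 3000, 10_000)

print(f"bandwidth-bound: at least {bandwidth_bound:.0f} s")
print(f"QPS-bound: at least {qps_bound:.0f} s")
```

Both bounds grow linearly with the number of clients, which is why adding clients eventually exhausts any fixed server-side budget.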

Investigation of P2P-based image distribution systems

There are several P2P projects in the open source community. Here is a brief introduction to these projects.

Dragonfly

Architecture

Flow diagram showing the Dragonfly architecture

Manager

Scheduler

Dfdaemon

Kraken

Architecture

Flow diagram showing the Kraken architecture

Agent

Origin

Tracker

Proxy

Build-Index

Dragonfly vs Kraken

Feature | Dragonfly | Kraken
High availability | Scheduler consistent hash ring supports high availability | Tracker consistent hash ring; multiple replicas ensure high availability
Containerd support | Supported | Supported
HTTPS image repository | Supported | Supported
Community activity | Active | Inactive
Number of users | More | Fewer
Maturity | High | High
Optimized for Nydus | Yes | No
Architecture complexity | Medium | Medium
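
Both projects rely on a consistent hash ring for high availability. As a rough illustration of the idea (not the actual code of Dragonfly or Kraken), the sketch below maps a task ID to one of several scheduler nodes and checks that removing a node only remaps the tasks that node owned. The class name, node names, task IDs, and virtual-node count are hypothetical.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (hash, node) entries
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key: str) -> int:
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def add(self, node: str) -> None:
        # Each node is placed on the ring at several pseudo-random positions.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def lookup(self, key: str) -> str:
        """Return the first node clockwise from the key's position."""
        if not self._ring:
            raise LookupError("ring is empty")
        idx = bisect.bisect_left(self._ring, (self._hash(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["scheduler-0", "scheduler-1", "scheduler-2"])
tasks = [f"task-{i}" for i in range(1000)]
before = {t: ring.lookup(t) for t in tasks}

ring.remove("scheduler-1")  # simulate a scheduler failure
after = {t: ring.lookup(t) for t in tasks}

moved = [t for t in tasks if before[t] != after[t]]
# Only tasks that were owned by the removed node change owner.
assert all(before[t] == "scheduler-1" for t in moved)
print(f"{len(moved)} of {len(tasks)} tasks remapped")
```

Because the remaining nodes keep their positions on the ring, a failure disturbs only the failed node's share of the keys, which is the property that makes the ring useful for high availability.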

Summary

Based on overall project maturity, community activity, number of users, architecture complexity, Nydus optimization, future development trends, and other factors, Dragonfly is the best choice among the P2P projects.

Proposal

For Volcano Engine, the main consideration is that VKE and VCI pull images through CR.

Based on Volcano Engine’s requirements for these products and on Dragonfly’s characteristics, a deployment scheme that accommodates many factors is needed. The scheme for deploying Dragonfly is designed as follows.

Architecture

Flow diagram showing the Volcano Engine architecture combined with Dragonfly's characteristics

Benchmark

Environment

Container Repository: bandwidth 10 Gbit/s

Dragonfly Scheduler: 2 replicas, request 1C2G, limit 4C8G, bandwidth 6 Gbit/s

Dragonfly Manager: 2 replicas, request 1C2G, limit 4C8G, bandwidth 6 Gbit/s

Dragonfly Peer: limit 2C6G, bandwidth 6 Gbit/s, SSD

Image

Nginx (500 MB)

TensorFlow (3 GB)

Component Version

Dragonfly v2.0.8

Pod Creation to Container Start

Time from Pod creation to container start for all Pods, with 50, 100, 200, and 500 concurrent Nginx Pods

Bar chart showing Nginx Pod creation to container start time for 50, 100, 200, and 500 Pods with OCI v1, Dragonfly, and Dragonfly & Nydus

Time from Pod creation to container start for all Pods, with 50, 100, 200, and 500 concurrent TensorFlow Pods

Bar chart showing TensorFlow Pod creation to container start time for 50, 100, 200, and 500 Pods with OCI v1, Dragonfly, and Dragonfly & Nydus

In large-scale image pulling scenarios, using Dragonfly or Dragonfly & Nydus reduces container startup time by more than 90% compared with OCI v1. Startup is even faster with Nydus because of its lazy loading feature: the Pod needs to pull only a small amount of metadata before it can start.
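
The effect of lazy loading can be sketched with a toy model. This is not how Nydus is implemented; it is a simplified illustration, with hypothetical class, function, and file names, of fetching content-addressed chunks only when a file is first read instead of downloading the whole image up front.

```python
import hashlib


class LazyImage:
    """Toy image: metadata maps file paths to chunk digests; chunk data
    is fetched from the remote store only on first read."""

    def __init__(self, metadata, remote_chunks):
        self._metadata = metadata      # path -> list of chunk digests
        self._remote = remote_chunks   # digest -> bytes (stands in for a registry)
        self._cache = {}               # digest -> bytes, local chunk cache
        self.fetched_bytes = 0

    def read(self, path: str) -> bytes:
        data = b""
        for digest in self._metadata[path]:
            if digest not in self._cache:
                chunk = self._remote[digest]
                # Content addressing: the digest doubles as an integrity check.
                assert hashlib.sha256(chunk).hexdigest() == digest
                self._cache[digest] = chunk
                self.fetched_bytes += len(chunk)
            data += self._cache[digest]
        return data


def build(files, chunk_size=4):
    """Split files into fixed-size chunks and index them by SHA-256 digest."""
    metadata, chunks = {}, {}
    for path, content in files.items():
        digests = []
        for i in range(0, len(content), chunk_size):
            chunk = content[i:i + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            chunks[digest] = chunk
            digests.append(digest)
        metadata[path] = digests
    return metadata, chunks


files = {"/bin/app": b"entrypoint", "/usr/share/data": b"x" * 1000}
metadata, chunks = build(files)
image = LazyImage(metadata, chunks)

# Starting the container only touches the entrypoint.
assert image.read("/bin/app") == b"entrypoint"
total = sum(len(content) for content in files.values())
print(f"fetched {image.fetched_bytes} of {total} bytes before start")
```

In this model the bulk of the image is never transferred unless the workload reads it, and identical chunks are fetched only once because they share a digest.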

Back-to-source Peak Bandwidth on Container Registry

Peak back-to-source bandwidth on the container registry with 50, 100, 200, and 500 concurrent Nginx Pods

Bar chart showing back-to-source peak bandwidth on the container registry for Nginx with 50, 100, 200, and 500 Pods in OCI v1 and Dragonfly

Peak back-to-source bandwidth on the container registry with 50, 100, 200, and 500 concurrent TensorFlow Pods

Bar chart showing back-to-source peak bandwidth on the container registry for TensorFlow with 50, 100, 200, and 500 Pods in OCI v1 and Dragonfly

Back-to-source Traffic on Container Registry

Back-to-source traffic on the container registry with 50, 100, 200, and 500 concurrent Nginx Pods

Bar chart showing back-to-source traffic on the container registry for Nginx with 50, 100, 200, and 500 Pods in OCI v1 and Dragonfly

Back-to-source traffic on the container registry with 50, 100, 200, and 500 concurrent TensorFlow Pods

Bar chart showing back-to-source traffic on the container registry for TensorFlow with 50, 100, 200, and 500 Pods in OCI v1 and Dragonfly

In large-scale scenarios, Dragonfly goes back to the source for only a small number of image pulls, whereas in the OCI v1 scenario every image pull goes back to the source. As a result, both the back-to-source peak bandwidth and the back-to-source traffic are much lower with Dragonfly than with OCI v1. In addition, with Dragonfly, the back-to-source peak and traffic do not increase significantly as concurrency increases.
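
This trend follows from how pieces are shared. Under the simplifying assumption that each piece of an image is fetched from the registry once and then served to every other peer over P2P, back-to-source traffic stays constant as concurrency grows, whereas direct pulls scale linearly. The sketch below models only this idealized case; the function names, piece size, and image size are hypothetical, and a real deployment fetches somewhat more from the source.

```python
def direct_traffic(pods: int, image_bytes: int) -> int:
    """Every pod pulls the full image from the registry."""
    return pods * image_bytes


def p2p_traffic(pods: int, image_bytes: int, piece_bytes: int = 4 * 1024 * 1024) -> int:
    """Idealized P2P: each piece is downloaded from the registry once,
    then shared among peers."""
    seen = set()          # pieces already present somewhere in the P2P network
    from_source = 0
    pieces = -(-image_bytes // piece_bytes)  # ceiling division
    for _ in range(pods):
        for piece in range(pieces):
            if piece not in seen:
                seen.add(piece)
                start = piece * piece_bytes
                # The last piece may be shorter than piece_bytes.
                from_source += min(piece_bytes, image_bytes - start)
    return from_source


image = 500 * 1024 * 1024  # a 500 MiB image, roughly the size of the Nginx test image
for pods in (50, 100, 200, 500):
    print(pods, direct_traffic(pods, image), p2p_traffic(pods, image))
```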

Reference

Volcano Engine: https://www.volcengine.com/

Volcano Engine VKE: https://www.volcengine.com/product/vke

Volcano Engine CR: https://www.volcengine.com/product/cr

Dragonfly: https://d7y.io/

Dragonfly GitHub Repo: https://github.com/dragonflyoss/Dragonfly2

Nydus: https://nydus.dev/

Nydus GitHub Repo: https://github.com/dragonflyoss/image-service