By: Ihor Dvoretskyi, Jun Du, Chris O’Haver, Michael Taufen, Hemant Kumar, July 31st, 2018
Kubernetes 1.11 continues to advance maturity, scalability, and flexibility of Kubernetes, marking significant progress on features that the team has been hard at work on over the last year. This newest version graduates key features in networking, opens up two major features from SIG-API Machinery and SIG-Node for beta testing, and continues to enhance storage features that have been a focal point of the past two releases. The features in this release make it increasingly possible to plug any infrastructure, cloud or on-premise, into the Kubernetes system.
Ihor Dvoretskyi: Welcome, everyone. Welcome to the CNCF webinar series. My name is Ihor Dvoretskyi. I’m a developer advocate at Cloud Native Computing Foundation and features lead on the Kubernetes 1.11 release team. Today we’ll speak about Kubernetes 1.11, the latest release, the newest one from the Kubernetes release team.
Ihor Dvoretskyi: Presenting with me today are Jun Du and Chris O’Haver from Kubernetes SIG-Networking, Michael Taufen from SIG-Node, Hemant Kumar from SIG-Storage, and Kaitlyn Barnard, who is the 1.11 release communications coordinator and will help us with running the webinar.
Ihor Dvoretskyi: With this release we have around 25 features that add new functionality to Kubernetes or heavily enhance existing functionality. The most important of them you can see on the current slide, and we’ll highlight them in today’s webinar in more detail.
Ihor Dvoretskyi: Now I’d like to pass the microphone to Jun Du from SIG-Networking, who will describe a new feature called IPVS-based in-cluster service load balancing. Welcome, Jun.
Jun Du: Hi. My name’s Jun Du. I’m from Huawei, and I’m the IPVS feature owner. Glad to see you. Good morning, it’s 1:00 in China. So let’s begin. Okay. As we know, iptables is the default proxy mode for kube-proxy. So what’s iptables?
Jun Du: iptables is a user space application that allows configuring the kernel firewall. The kernel firewall is implemented on top of Netfilter, and iptables configures Netfilter by defining tables, chains and rules. There are five tables and five built-in chains in iptables, okay.
Jun Du: So what’s Netfilter? Netfilter is a framework provided by the Linux kernel that allows customization of networking-related operations such as packet filtering, NAT, which means network address translation, port translation and so on.
Jun Du: So everything seems okay with iptables, but iptables doesn’t perform very well at large scale. There are two main issues with iptables acting as the load balancer: the latency to access a service becomes very high once the number of services increases, and the latency to add or remove iptables rules also increases as the number of services and rules becomes very large.
Jun Du: There are some optimizations to iptables mode, but today I won’t introduce them. I will introduce IPVS mode instead, because that is what we have been working on. So let’s have a look at IPVS.
Jun Du: What’s IPVS? IPVS, the same as iptables, is built on top of Netfilter, but IPVS works at the transport layer. It’s a layer-4 load balancer, which directs requests for TCP-, UDP- and SCTP-based services to real servers. IPVS is designed for load balancing, not for firewalling. Okay. And IPVS supports three load balancing modes: NAT mode, DR mode and IP tunneling mode. Only NAT mode supports port mapping, so in Kubernetes we use NAT mode, okay. Although DR mode is very fast, we use NAT mode, and NAT mode is also very fast: in IPVS mode we can achieve more than 90% of bare-metal performance.
Jun Du: So why do we need IPVS? Okay, firstly, it avoids the performance problems, because IPVS uses a hash table as its data structure instead of a chain of rules. Okay, and there are more load balancing algorithms in IPVS, such as round robin, source hashing and destination hashing, and it also supports dynamic algorithms such as least connections and so on. We can also extend those with weighted round robin, weighted least connections and so on. And IPVS also supports server health checking, connection retries, and sticky sessions.
Jun Du: We contributed this IPVS mode in, I think, Kubernetes 1.8. In 1.8 it was still an alpha version, in 1.9 IPVS mode graduated to beta, okay, and in 1.11 it becomes GA, okay. To run kube-proxy in IPVS mode, you only need to change three things, okay.
Jun Du: Firstly, load the required kernel modules. We need to load the five required IPVS-related kernel modules, okay. Secondly, since IPVS mode is a proxy mode, we need to switch kube-proxy’s proxy mode to ipvs. And thirdly, make sure the SupportIPVSProxyMode feature gate is enabled if you’re on a version before 1.10: before 1.9 it was still in alpha stage, and after 1.9 it’s in beta, so it’s enabled by default, okay.
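The three steps above can be sketched as shell commands; this is a sketch based on the upstream kube-proxy IPVS documentation, so check the module list and flag names against your kernel and Kubernetes version before use:

```shell
# 1. Load the IPVS-related kernel modules (on kernels 4.19+ use
#    nf_conntrack instead of nf_conntrack_ipv4).
modprobe ip_vs
modprobe ip_vs_rr
modprobe ip_vs_wrr
modprobe ip_vs_sh
modprobe nf_conntrack_ipv4

# 2. Switch kube-proxy's proxy mode to ipvs (a flag here, or the
#    equivalent field in the kube-proxy configuration file).
kube-proxy --proxy-mode=ipvs

# 3. On clusters older than 1.10, also enable the feature gate explicitly.
kube-proxy --proxy-mode=ipvs --feature-gates=SupportIPVSProxyMode=true
```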
Jun Du: So let’s take an example. If we run kube-proxy in IPVS mode, kube-proxy will do three things when a user creates a service. Firstly, it makes sure a dummy network interface exists on the node; the default name is kube-ipvs0. Secondly, it binds the service IP address to the dummy interface. For ClusterIP-type services, the service IP is the cluster IP, and for NodePort services, okay, the addresses are the cluster IP plus the node IP. And thirdly, it creates an IPVS virtual server for each service IP, and an IPVS real server for each endpoint.
Jun Du: So let’s see some examples next. This is the last page of this share, okay. Firstly, we create an nginx service. Its cluster IP is 10.102 and so on, okay, and it uses port 3080, okay, and it has two endpoints, okay, as you see. We can see there is a dummy interface called kube-ipvs0, and we find the cluster IP bound to that dummy interface, okay.
Jun Du: And kube-proxy will create the IPVS virtual servers and real servers for the service and its endpoints. If we type ipvsadm -ln, then we can see the virtual server and the real servers. As you can see, the virtual server is in Masq, the NAT mode of IPVS, okay.
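A minimal sketch of that inspection on a node, assuming a service named nginx in the default namespace (the service name is illustrative):

```shell
# Look up the service's cluster IP and the backing endpoints.
kubectl get svc nginx -o wide
kubectl get endpoints nginx

# The cluster IP should be bound to the dummy interface kube-proxy created.
ip addr show kube-ipvs0

# List the IPVS virtual servers and their real servers; service entries
# appear with forwarding method "Masq" (IPVS NAT mode).
ipvsadm -ln
```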
Jun Du: Okay, thank you everybody, that’s all from me. Thank you. You can ask me questions.
Ihor Dvoretskyi: Thank you Jun. We’ll have a Q&A session at the end of the webinar, so please prepare your questions for that. Thank you Jun, so now we’ll speak about CoreDNS. CoreDNS is an independent project incubated currently at the CNCF. At the same time, in the current Kubernetes release, it’s one of the most important building blocks for the networking stack.
Ihor Dvoretskyi: So now I’d like to welcome Chris from SIG-Networking, who will speak more about CoreDNS in Kubernetes 1.11.
Chris O’Haver: Hello everyone, let’s go to the first slide. So first, what is CoreDNS? Well, CoreDNS is a standalone DNS server. It’s a CNCF incubation project. It uses a plugin chain architecture, meaning that a DNS request comes in, it passes through a series of plugins, and each plugin performs a function on that request.
Chris O’Haver: It’s flexible: you can combine plugins together for advanced functionality. And it’s extensible: it provides a framework for writing your own plugin if the existing plugins don’t do what you need.
Chris O’Haver: So that’s what CoreDNS is standalone, but how does it fit into Kubernetes? There’s one plugin, called kubernetes, that enables CoreDNS to provide DNS-based service discovery in a Kubernetes cluster. It follows the DNS-based service discovery specification that Kubernetes defines. It can replace kube-dns because it’s functionally equivalent, and CoreDNS is an approved alternative to kube-dns in Kubernetes 1.11. It’s also the default in kubeadm in 1.11, and it’s an option in kops, kubeup, minikube, kubespray and other installers.
Chris O’Haver: CoreDNS fixes a few issues when compared to kube-dns. There are some open issues in kube-dns that are resolved in CoreDNS. Some of these are feature requests, some of these are bugs. I won’t go through all of them, but for example CoreDNS allows custom DNS entries, which kube-dns doesn’t, and it fixes a little bug like missing A records in a certain case.
Chris O’Haver: There are some outward changes with CoreDNS as compared to kube-dns. A somewhat minor one, at least from a usability perspective, is that kube-dns has three containers and CoreDNS has one. From a metrics perspective, they both report metrics to Prometheus, but the sets of metrics differ; part of the reason is the one container versus three. The configuration is the biggest change you’ll see. There are migration tools available for converting the kube-dns config map to CoreDNS. CoreDNS is fully configurable by its config map and kube-dns is not; that’s another difference.
Chris O’Haver: Here’s an example of a CoreDNS configuration. I’m trying to keep it short here, so I won’t go into all the details, but the bottom section there is pretty much what you’d see in a default install from kubeadm. The top section is an example of what a stub domain looks like. That’s basically saying everything for example.com that comes in on port 53 gets proxied out to 1.2.3.4, and we cache it for 300 seconds. The bottom section shows health monitoring, the kubernetes plugin itself with the default settings, prometheus, which enables metrics reporting, and then the fall-through default: forwarding everything else through that same resolv.conf.
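The slide itself isn’t reproduced here, but a Corefile matching that description might look roughly like this sketch; the stub-domain zone and forwarder address are the illustrative values from the talk, and the exact default block varies by CoreDNS version:

```shell
# A stub domain on top, and a kubeadm-style default server block below it.
cat > Corefile <<'EOF'
example.com:53 {
    proxy . 1.2.3.4
    cache 300
}
.:53 {
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        upstream
        fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    proxy . /etc/resolv.conf
    cache 30
}
EOF
```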
Chris O’Haver: So to highlight a couple of new features that CoreDNS has, one is verified pod records. Here’s an example of what a pod record looks like. In kube-dns, for IP-style names like that, where you have an IP written with dashes, it will return a record even if that pod doesn’t exist. As long as it’s a valid IP address, it will say, “Yeah, that exists,” and give you an answer. In CoreDNS, you have the option to only create those records for real pods, and to enable that, you would use the pods verified option in the configuration.
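Enabling that option is a one-word change in the kubernetes plugin block; a minimal sketch:

```shell
cat > Corefile <<'EOF'
.:53 {
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        # "verified" answers pod queries like 1-2-3-4.default.pod.cluster.local
        # only when a pod with that IP actually exists in that namespace.
        pods verified
        fallthrough in-addr.arpa ip6.arpa
    }
    proxy . /etc/resolv.conf
}
EOF
```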
Chris O’Haver: Another cool feature, which is not enabled by default but which you can enable if you want, is server-side searching with the autopath plugin. I’m not sure if you’re familiar with the ndots:5 problem in Kubernetes, but if you aren’t, a good example of where that goes badly is when you do a search for an external domain, for example Infoblox.com, from a pod. It’s first going to try the namespace, and then it’s going to try the next level up, and the next level up, and the next level up, and so what happens is you end up with around six exchanges between the pod and the DNS server before you get a real answer.
Chris O’Haver: So what CoreDNS does with the autopath feature is it answers this in one round trip. It understands that the odd-style query at the beginning is the beginning of a long search, so it just short-circuits it and gives the final answer in one step. One thing that makes this problem worse in kube-dns is that none of those NXDOMAIN responses are cached, because there’s no negative caching by default, but it is all cached in CoreDNS.
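Turning that on takes two plugins working together; a sketch, assuming the autopath syntax from the CoreDNS plugin documentation (autopath needs pod verification so it can work out the client’s search path):

```shell
cat > Corefile <<'EOF'
.:53 {
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        # autopath requires verified pods so CoreDNS knows which namespace
        # (and therefore which search path) the querying pod is in.
        pods verified
    }
    # "@kubernetes" tells autopath to take the search path from the
    # kubernetes plugin above.
    autopath @kubernetes
    proxy . /etc/resolv.conf
}
EOF
```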
Chris O’Haver: Some other features I won’t go into as much detail on: we allow zone transfers for the Kubernetes domain; we have namespace and label filtering, so you can expose just a limited set of your services if you want; you can adjust the TTL up and down depending on your needs; and we support negative caching by default.
Chris O’Haver: So I mentioned that we’re built on a plugin chain, and so there are a lot of built-in plugins; there were 34 when I counted last. A few interesting ones that are used a lot are: file, which allows you to serve a zone from a zone file; rewrite, which allows you to rewrite requests as they come in, a kind of behind-the-scenes aliasing that’s invisible to the client; and template, which is great for debugging, for example; it allows you to define responses based on a regex and a Go template.
Chris O’Haver: And if you’ve got other needs that the built-in plugins don’t solve, you can build your own plugins. There are some cool external plugins that other people have built: unbound uses the unbound library to do recursive DNS; pdsql will serve records from a backend database; redisc will use Redis as a shared cache across multiple CoreDNS instances; and kubernetai, which is the plural of Kubernetes in case you were wondering, allows you to connect CoreDNS to multiple Kubernetes clusters for cross-cluster discovery.
Chris O’Haver: Real quick, this is the roadmap. In 1.9 we were alpha, beta in 1.10, and now we’ve graduated to GA. We’re not the default yet, but we are the default in kubeadm installs. The future goals are to make CoreDNS the default, replacing kube-dns, and eventually deprecate kube-dns.
Chris O’Haver: Some links here. If you have any questions or issues or want support, we’re very active in GitHub and in Slack. For any security related issues, send to security@CoreDNS.io. Each plug in is documented and you can find that documentation on the CoreDNS website, and also thanks to John Belamaric, I pretty much took 99% of the content of this presentation from his blog, which I’ve linked here.
Chris O’Haver: Thanks everyone, and also we’re conducting a survey to try to see who’s using CoreDNS and what plugins they’re using. There’s a link here. Thank you.
Ihor Dvoretskyi: All right. Thank you, Chris. I’m really excited to see how much progress CoreDNS has made recently, going from a replacement for Kubernetes’ built-in DNS to the default one these days.
Ihor Dvoretskyi: So besides networking, another major area in Kubernetes is node functionality. In fact, I’d like to welcome Mike Taufen from SIG-Node, who will talk about the Dynamic Kubelet Configuration feature.
Michael Taufen: Hi everyone. I’m Mike Taufen. I work in SIG-Node in Kubernetes and this release, we launched to beta, a pretty cool new feature called Dynamic Kubelet Configuration.
Michael Taufen: So first, before we get into exactly what it is: why do we want to do something called Dynamic Kubelet Configuration? Just for some history, Kubernetes is built around the idea of offering declarative APIs that are hosted in a central control plane, that’s your API server. In Kubernetes, you can use these APIs to configure most things about the system, unless the thing you’re trying to configure isn’t already represented by a core abstraction of that system. So most things run in pods, but most Kubernetes deployments don’t run the Kubelet in a pod. Some do, but most don’t.
Michael Taufen: So by lifting the Kubelet configuration into the control plane, we make it more visible and, in many cases, more convenient to manipulate. There are certainly ways to dynamically configure a Kubelet in Kubernetes today. Some people just SSH into nodes and manually mess with things. Some people use third-party configuration management tools, but really none of these ways are built with Kubernetes in mind, and they don’t have a first-class user experience around them.
Michael Taufen: So what is Dynamic Kubelet Configuration? Well, in Kubernetes 1.10, we launched the ability to configure the Kubelet via a structured, versioned, Kubernetes-style config file API, as opposed to using command line flags, and to use this, you simply pass the path to this file to the Kubelet.
Michael Taufen: And in 1.11, we launched this Dynamic Kubelet Configuration feature to beta, which is basically just a way to deliver this config file in a live cluster to a live node and have the Kubelet switch to using the new configuration. So you use the exact same structured, versioned, Kubernetes-style config file format as the file you would normally just write locally, but you can post this config file into the Kubernetes control plane via a config map, and then you can tell nodes to refer to that config map that contains your config file. And once you tell the node to refer to it, the Kubelet will download that config and restart to use the new configuration.
Michael Taufen: So before we get into the low-level details of what the API for this looks like, let’s just go through a high-level workflow from a user standpoint to reconfigure a single node in their cluster. The first thing the user would do is write that config file and post it in a config map to their cluster control plane, and then once the user has done that, there’s a field on the node object’s spec called ConfigSource.ConfigMap that the user would update to refer to that config map. That field is effectively an object reference.
Michael Taufen: So once that field is updated, the Kubelet associated with that node object, which is maintaining a watch on that node, will see the update. It will then go find that referenced config map, download it, unpack the files in it, and store that locally as what we call a config checkpoint internally. Once that config is checkpointed, your Kubelet will restart to try the new config checkpoint, and then, depending on what happens, the Kubelet will update the node status and say, “Yes, I’m using this new config. It’s great. Everything is happening,” or “The new config didn’t pass my validation, for example, so I’m going to report an error here, and I’m just going to fall back to the checkpoint that I had previously until you can fix whatever the problem was, so I don’t disrupt your workloads.”
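That per-node workflow can be sketched with kubectl; the file, ConfigMap and node names below are illustrative, and the exact spec field layout changed between releases, so check it against your cluster version:

```shell
# 1. Post the kubelet config file to the control plane as a ConfigMap
#    (--append-hash makes the name unique per content, so the ConfigMap is
#    effectively immutable).
kubectl -n kube-system create configmap my-kubelet-config \
    --from-file=kubelet=kubelet-config.yaml --append-hash

# 2. Point the node at that ConfigMap via spec.configSource; the suffix
#    below stands in for whatever hash the previous command reported.
kubectl patch node my-node -p '{
  "spec": {
    "configSource": {
      "configMap": {
        "name": "my-kubelet-config-<hash>",
        "namespace": "kube-system",
        "kubeletConfigKey": "kubelet"
      }
    }
  }
}'

# 3. Watch the node status for the assigned/active/lastKnownGood sources.
kubectl get node my-node -o yaml | grep -A 10 config
```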
Michael Taufen: So this workflow extends to multi-node rollouts. Instead of simply updating the spec for one node, you update it for all the nodes you want to refer to the new config map, and because you have control over which nodes refer to the new config and when, you can control the rate and policy surrounding that rollout.
Michael Taufen: So there’s a lot of detail on this slide. This is an example of the actual objects in the control plane that we’re dealing with. You can see on the right, we have the config map that contains, here in this slide, a truncated Kubelet configuration; the Kubelet configuration surface is actually quite large, there are over 100 fields. And on the left, we have the node object that’s being configured in this example, and in the spec for that node object, you can see the ConfigSource.ConfigMap field. In that field, we’re just specifying the symbolic information necessary to refer to the config map, and also note that we’re saying which specific key of the config map the Kubelet config file is in. If you follow the red arrows, those match with the config map there.
Michael Taufen: And then in the status of the node object, you’ll actually see that there are three config source representations, and you can think of those as the three things the Kubelet cares about. The first one is simply what the Kubelet believes it has been assigned to use, and that’s sort of like an ack. Once the Kubelet has downloaded a new checkpoint, recorded locally which checkpoint it’s supposed to use, and come back up, it will report that, so you can see that it at least acknowledged, “Oh, I saw you reconfigured me.”
Michael Taufen: The next one is called active, and that is the configuration checkpoint that the Kubelet chose to use for the current runtime duration. The active config can actually be equal to either the config source you assigned, or possibly the last-known-good config source if you assigned an invalid config, for example. So then, the final piece here is that last-known-good config source. The Kubelet basically waits for a period of time while it runs a config, and once it has run that config for that period of time, it says, “Okay, I’m pretty comfortable with this config.” We might do more advanced things in the future to determine whether a config is causing too many crashes or something like that, but today it just validates the config over that period of time. Once a config becomes the last known good, it’s also reported in the status.
Michael Taufen: And then finally, you can see the config map in this example slide has an error: kubeAPIQPS is set to a string, which is not valid, and the fact that there was a validation error is being reported in the status as well. For pretty much all the validation errors and other kinds of errors that can be reported in the status, if you just search for the exact same text in the Kubelet log, you’ll get a more detailed log message.
Michael Taufen: A few other different notes on this stuff. So the node status is explicit regarding the uid and resource version the Kubelet actually downloaded since technically all this stuff is mutable in your cluster and you want to be able to tell if maybe somebody changed something when they shouldn’t have, and we specifically don’t have the ability to pin the symbolic information in the spec to a uid or a resource version for a number of reasons.
Michael Taufen: One of which is that it’s impossible to statically specify these versions because they’re assigned by the API server. In addition to that, the fact that there’s no real history that’s easy to query for like, “Oh give me this resource version of this resource.” It means that it will be very hard to debug what the actual intent was in a live environment. If you do require versions on your configuration, we recommend that you use, inject version information into the name of the config map that you post and ideally treat that config map as immutable. There’s a Kube control flag called dash dash or apend hash when you’re creating config maps that can help with this by appending a hash of the data in the config map to the name when you create it.
Michael Taufen: So we don’t live in a perfect world, so there are plenty of gotchas with all kinds of things. One is that the configuration API is a little bit low level. So because we’re using the same file format, this lets you set the whole config dynamically but this is powerful and you should be careful. Some things are very safe to change like QPS, some things you probably shouldn’t touch like the names of the cgroups.
Michael Taufen: There is inline documentation at the link in this slide for field advice on which things you should be more careful with. So if you follow that link, you can find more information and just as a reminder, the intent of this feature is to target system experts and service providers who can qualify these kinds of things and understand the system very well.
Michael Taufen: Another gotcha is that we are moving to this config file API from command line flags, and there are still a lot of command line flags that are deprecated but not yet removed and can still specify the same information as the config file. For backwards compatibility reasons, these flags take precedence if they’re still specified on your command line, so today you do still need to know a little bit about how your node was initially deployed to understand the full spectrum of configuration applied to it. We’re hoping to reduce that surface area in the future as we migrate these flags.
Michael Taufen: So these are just some thoughts on future work. As I noted in the previous slide, there are a lot of low level knobs in configuration today. This is the case for both the Kubelet and a lot of other components. Most users actually want high level policy. When users tune different configuration parameters, they’re not just doing it for the heck of it, they actually have an end goal. They’re trying to optimize for something.
Michael Taufen: So one open question is can we have or offer high level opinions or name strategies that satisfy common use cases and tuning cases or they’re high level principles that work for everyone that we can just bake into our components instead of exposing knobs to two things. The higher level we can make these configuration surface areas, the friendlier the system will be to non-experts and also experts.
Michael Taufen: So another item is that dynamic Kubelet configuration is a manual pro node configuration process so there’s no type of node pool type orchestration built in but the Kubernetes cluster API effort is focused on solving this among many other issues, and you can follow that link to find out more about that. And of course, we are continually migrating away from flags into these versioned configure file APIs. And then just one more slide here.
Michael Taufen: So these are some other links you can follow, the first is the blog post with some details on dynamic Kubelet configuration. The second two are the official Kubernetes documentation for using the feature, in addition to the file based configure API feature. This third link is a doc on the philosophy behind why we should use versioned configuration files instead of command line flag APIs. So if you’re interested in some of the rationale behind this migration, I’d recommend reading that. And then the final doc is just a general document on the API philosophy behind Kubernetes. Given that we’re migrating all this Kubernetes style APIs, that’s one of the primary motivators. Thank you everyone.
Ihor Dvoretskyi: Great. Thank you, Mike. So now we’re passing to the storage area, yet another important area in Kubernetes. Those who are following our Kubernetes release webinar series will have noticed that we are constantly highlighting the storage features here. And now I’d like to invite Hemant Kumar from SIG-Storage, who will speak about resizing persistent volumes.
Hemant Kumar: Hello everyone. Hemant here. I’m going to talk about resizing persistent volumes. This is something we introduced as alpha in 1.8, and we are introducing it as beta in 1.11.
Hemant Kumar: Okay, so what is a persistent volume? Persistent volumes provide a storage layer in Kubernetes which is useful for applications that need to persist data, for example databases or Prometheus, or applications that need state.
Hemant Kumar: There are three building blocks of persistent volumes, if you’re not already familiar with them. There are storage classes, which are like templates from which persistent volumes are created. There are persistent volumes, also called PVs, the actual cluster-scoped volume objects, which tie to the underlying storage. And then there are persistent volume claims, or PVCs, which are user-managed objects that a user can use to claim persistent volumes. Storage classes and persistent volumes are cluster-scoped resources, while the persistent volume claim is a namespaced resource; the claim is what most users typically work with.
Hemant Kumar: So what problem does resizing solve? The problem is that once we provision PVs and PVCs, they remain a fixed size throughout their life cycle, and as applications that use them need more storage, it becomes essential to move them to larger PVs, and that can be tricky. You typically have to manually create new PVs and PVCs, back the data up, delete the old PVCs, and then restart the pods that are using them. It becomes kind of challenging, and for a stateful application it becomes even harder, because the PVCs don’t directly tie to the pods.
Hemant Kumar: So what does resizing a persistent volume mean? It means in-place expansion of a volume by editing the persistent volume claim object. A user can go ahead and edit the PVC object and get an updated volume with the new size. We do not support shrinking or reducing the size of persistent volumes. And when we say resizing, we mean we expand both the remote volume object, which could be an EBS volume, a GCE PD or a Ceph RBD volume, and the file system on the node. In 1.11 we support these volume types: EBS, GCE PD, Azure Disk, Azure File, GlusterFS, Cinder, Portworx and Ceph RBD. One caveat is that, where file system expansion is required, you generally have to restart the pod; that’s something we will cover in detail in the next slides.
Hemant Kumar: So how do we enable expanding persistent volumes? As I mentioned, the feature was alpha in 1.8 and in 1.11 it’s going beta, but a cluster admin must opt in to this feature by enabling expansion on the storage class: only PVCs created from a storage class that has allowVolumeExpansion set to true are allowed to be expanded. Other PVCs are by default not allowed to be expanded. This is to ensure that the cluster admin has hooks and knobs, so the users are aware and the feature doesn’t silently become available to everyone.
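A sketch of such an opt-in storage class; the class name is illustrative, and the AWS EBS provisioner is used only as an example of one of the supported volume types:

```shell
# A StorageClass that permits expansion of the PVCs created from it.
kubectl apply -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: expandable-gp2
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
allowVolumeExpansion: true
EOF
```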
Hemant Kumar: So how do we expand a persistent volume? It’s pretty simple. You just need to run kubectl edit pvc with the PVC name, and in your editor edit spec.resources.requests.storage. As you can see, you can edit the size: it was 5GB previously, and I’m requesting 10GB, so it will expand the volume. The status is still showing 5GB because the current volume is 5GB. So you can go ahead and edit your PVC, save it, and the resize process will start.
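The same edit can be done non-interactively; a sketch, with an illustrative claim name:

```shell
# Bump the claim's requested size from 5Gi to 10Gi; the resize starts once
# the change is saved.
kubectl patch pvc my-claim -p \
    '{"spec": {"resources": {"requests": {"storage": "10Gi"}}}}'

# spec shows the requested 10Gi; status.capacity stays at 5Gi until the
# resize completes.
kubectl get pvc my-claim -o yaml
```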
Hemant Kumar: So after editing PVC, for shared file systems like Glusterfs, Azure File, if you have a pod that is using the PVC, the expanded storage is immediately available to the pod. There’s nothing else you need to do. The pod should have the new capacity. Everything should work as it is, no problem.
Hemant Kumar: But for block storage volume types like EBS, GCE PD and Azure Disk, you will have to restart the pod so that the volume can be expanded on the node, for the complete resizing process to finish.
Hemant Kumar: Expanding the file system requires restarting the pod once the underlying volume has been resized. So the steps generally are: edit the PVC to request more space; once the underlying volume has been expanded by the storage provider, the PVC will get the FileSystemResizePending condition, as you can see in this case; wait for the PVC to have that condition, and then restart the pod, either by deleting and recreating it or by scaling the deployment down and up. Once you’ve done that, the pod should come back with the new size available to it, and you can use the additional space.
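Those steps can be sketched as follows; the claim and deployment names are illustrative:

```shell
# Wait until the storage provider has grown the backing volume and the
# claim reports the FileSystemResizePending condition.
kubectl get pvc my-claim \
    -o jsonpath='{.status.conditions[?(@.type=="FileSystemResizePending")].status}'

# Then restart the pod so the file system can be expanded on the node,
# e.g. by scaling the owning deployment down and back up.
kubectl scale deployment my-app --replicas=0
kubectl scale deployment my-app --replicas=1

# The claim's reported capacity should now match the requested size.
kubectl get pvc my-claim
```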
Hemant Kumar: As I mentioned earlier, in 1.11 volume resizing graduates to beta. As a result, it’s enabled by default and you no longer have to enable the feature gate. But 1.11 also introduces a new feature called ExpandInUsePersistentVolumes. In the previous slide, you could see that we have to restart the pod, deleting and recreating it or scaling the deployment down and up, for the file system resizing to be finished. But if this feature is enabled, and it’s an alpha feature, then you don’t have to do anything: the file system will be automatically expanded by the Kubelet, and the new size will be available to the pod.
Hemant Kumar: One thing to keep in mind is that automatic file system expansion is only supported for volumes that are in use by a running pod. Kubernetes itself will not schedule the volume to a node just so that it can be resized. No, it won’t do that. For the automatic online expansion to work, there must be a pod that is using the volume.
Hemant Kumar: We are also working on enabling volume expansion for CSI, there’s a link there, and in 1.12 we are also working on enabling volume expansion for Flex volumes, and we are trying to improve the general stability of file system expansion support. So those are some of the future works we are planning.
Hemant Kumar: The persistent volume documentation in Kubernetes covers volume expansion and how it works, and there's the blog post. For how persistent volume expansion will work for CSI, there's a pull request that you can have a look at. So yeah, I think that's all for me. Thank you.
Ihor Dvoretskyi: Okay, thank you Hemant. So now it's Q&A time. I'd like to remind you that you may ask your questions in the Q&A section in the webinar software interface. Also, I'd like to remind you that this session was recorded and will be available soon on the CNCF.io website in the events section.
Ihor Dvoretskyi: So let's start with the existing questions. The first question is from Sadeep: are the Kubelet and kube-proxy both running as pods? I'm not sure who is the right person to answer this. So probably Mike.
Michael Taufen: Yeah, I know the answer so I can answer it. Typically the Kubelet in most Kubernetes deployments does not run as a pod. Kube-proxy, however, usually does; a lot of people run it as a DaemonSet. I can also answer the most recent question, from [Zenit 00:46:07]: does dynamic Kubelet config mean that we run the Kubelet in a container? The answer is no. Dynamic Kubelet config is necessitated by the fact that we don't run the Kubelet in a container in most cases.
Michael Taufen: If you did run the Kubelet in a pod, for example, you could potentially plumb config to it via a ConfigMap volume source, which is how most people dynamically configure workloads with config files, but we don't do that so we don't get that channel. Thanks.
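The ConfigMap-volume channel Mike mentions looks roughly like this for an ordinary workload (all names here are hypothetical, for illustration only):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app                # hypothetical
spec:
  containers:
  - name: app
    image: my-app:1.0         # hypothetical image
    volumeMounts:
    - name: config
      mountPath: /etc/my-app  # config file appears here inside the container
  volumes:
  - name: config
    configMap:
      name: my-app-config     # the ConfigMap holding the config file
```

Because the Kubelet itself is usually not a pod, it cannot receive its own configuration this way, which is why dynamic Kubelet config exists as a separate mechanism.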
Ihor Dvoretskyi: Thank you, Mike. Another question is about the storage area, so Hemant is the best person to answer here: does volume expansion also need a feature gate enabled?
Hemant Kumar: No, in 1.11 the feature is in beta, so it's enabled by default.
Ihor Dvoretskyi: Great, and another question, also about storage: what happens in the case of Ceph RBD, where the volume already had extra space and is only partially used?
Hemant Kumar: It depends on what the user has requested when we expand the volume. Let's say the user is requesting 10GB of space and the volume was already 10GB; then the operation is basically a no-op, but it will consider the expansion successful and finish the entire operation. But let's say the user is requesting 10GB and the underlying Ceph RBD volume has 8GB, of which you have used only 4GB. Kubernetes doesn't know that there's still 4GB free on it, so it will try to expand the volume to the 10GB that the user requested, go through the whole volume expansion workflow, and finish the resize process.
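The decision Hemant walks through can be sketched in Python (this is an illustration of the control logic as described, not actual Kubernetes code; the function name is hypothetical):

```python
def reconcile_expansion(requested_gb, actual_volume_gb):
    """Decide what the resize workflow does for one PVC.

    requested_gb:     new size the user put in the PVC spec
    actual_volume_gb: current size of the backing volume (e.g. Ceph RBD)

    Only these two numbers are compared; how much of the volume's
    space is actually occupied by files is not considered.
    """
    if actual_volume_gb >= requested_gb:
        # Volume is already big enough: treat the expansion as a
        # successful no-op and finish the operation.
        return "no-op, marked successful"
    # Otherwise run the full expansion workflow up to the requested size.
    return f"expand volume from {actual_volume_gb}GB to {requested_gb}GB"

print(reconcile_expansion(10, 10))  # volume already 10GB -> no-op
print(reconcile_expansion(10, 8))   # 8GB volume (even if only 4GB used) -> expand
```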
Ihor Dvoretskyi: Great. Thank you. The next question is a more general question about the performance of Kubernetes, so probably someone here, anyone from the current presenters, may have an answer. Please give us an update on any performance benchmarking tools and reports that are available for production use of Kubernetes. What is the maximum number of physical servers that can be used, how many pods does it support, and how many VMs can be used?
Ihor Dvoretskyi: I can start with the last part: what is the maximum number of servers or nodes that can be used? Officially, to date, Kubernetes supports up to 5,000 nodes in a single cluster. Does anyone here have any insights on the benchmarking tools for Kubernetes?
Chris O’Haver: So we don’t have anyone from SIG-scalability here right now, but there’s active work in kind of expanding that and building tests around expanding that capacity. There is a link that I’ll pop in the chat around building large clusters. So I’ll just read off some of the high level notes.
Chris O’Haver: So at Kubernetes 1.11, Kubernetes supports clusters with up to 5,000 nodes. More specifically, it will support configurations that meet all the following criteria: no more than 5,000 nodes, no more than 150,000 total pods, no more than 300,000 total containers and no more than 100 pods per node.
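Those four thresholds must all hold at once, which can be captured in a quick validation helper (an illustration of the documented criteria, not an official tool):

```python
# Supported-configuration thresholds for Kubernetes 1.11,
# as listed in the large-cluster documentation.
LIMITS = {
    "nodes": 5_000,
    "total_pods": 150_000,
    "total_containers": 300_000,
    "pods_per_node": 100,
}

def within_supported_limits(nodes, total_pods, total_containers, pods_per_node):
    """Return True only if every criterion is met simultaneously."""
    actual = {
        "nodes": nodes,
        "total_pods": total_pods,
        "total_containers": total_containers,
        "pods_per_node": pods_per_node,
    }
    return all(actual[key] <= LIMITS[key] for key in LIMITS)

print(within_supported_limits(5_000, 150_000, 300_000, 100))  # True: at the limits
print(within_supported_limits(6_000, 100_000, 200_000, 50))   # False: too many nodes
```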
Chris O’Haver: What I can say is that we have limits you can set on the Kubelet that allow you to tweak both the maximum pods per node and the maximum pods per core. From, let's say, an OpenShift perspective, it supports 250 pods per node by default, but again this is a flag that you can set. We have seen clusters push 300 to 500 pods per node.
Chris O’Haver: From the pods-per-core perspective, it's more arbitrary, depending on the applications you're trying to design. It's more a matter of letting the application run unbounded and then deciding what you want to do from there; that will give you a better idea of what your benchmark can be. But I can say from the SIG-Scalability perspective that we do run tests, and we have been integrating more of the scalability tests into the test suites, so you'll see more activity around that in the future at least.
Ihor Dvoretskyi: Great answer, thank you. Let's move on to the other questions. Will the officially supported Docker version change somewhere in the future? Probably Mike, since you're working in SIG-Node, you have some thoughts about this.
Michael Taufen: Yeah, if you’re looking for a specific timeline, I don’t know anything. The general answer is yes, probably because it does change periodically but yeah, I don’t really know more than that. Sorry.
Ihor Dvoretskyi: Yeah, no problem, it’s a bit out of scope of our current webinar.
Chris O’Haver: Yeah, just, so I’ll pop in one more time-
Ihor Dvoretskyi: Yeah.
Chris O’Haver: So just from a Docker and runtime supportability standpoint in general, you're going to see more and more development around the community moving to the idea of OCI-compliant containers. So not just Docker but everything that supports the OCI spec. So watch news around CRI-O, which is kind of what we're going to be moving towards.
Ihor Dvoretskyi: Great, thank you. Another question is: does anyone have any sort of concerns about Knative components? I would say that Knative is totally out of scope for the current webinar. At the same time, I would suggest visiting the Kubernetes users mailing list and the Kubernetes Discourse forum to discuss this question with the community members.
Ihor Dvoretskyi: So any other questions to our current presenters about Kubernetes 1.11? We have a few minutes more here. Nope, so I don’t see any extra questions in the Q&A section. So thank you for joining us today. I’d like to remind you that CNCF webinars are happening every Tuesday at this time in your time zone, so please join us next week for the upcoming webinar. Thank you for joining us today and goodbye.