Vadim Eisenberg's blog (feed last updated 2024-03-13). I am a software developer, ex-IBMer.
Disclaimer: I do not warrant the correctness of the information provided or its fitness for any purpose.
Posted by Vadim Eisenberg on 2022-03-22 (updated 2022-03-23).
<h1 style="border-bottom: 1px solid var(--color-border-secondary); box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; line-height: 1.25; margin-bottom: 16px; margin-left: 0px; margin-right: 0px; margin-top: 0px !important; margin: 0px 0px 16px; padding-bottom: 0.3em;"><span style="font-family: arial;">Kubernetes scalability bottlenecks: can Kubernetes scale to manage one million objects?</span></h1><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;"><span style="color: #24292e;">While Kubernetes was originally designed as a platform for orchestrating containers, it is increasingly becoming a platform for orchestrators and for applications that manage other entities. 
Kubernetes can run applications that manage </span><span style="color: black;"><a href="https://kubevirt.io">VMs</a>, <a href="https://www.redhat.com/en/technologies/management/advanced-cluster-management">multiple Kubernetes clusters</a>, <a href="https://kubeedge.io/en/">edge devices and applications</a>, and can also orchestrate provisioning of <a href="https://metal3.io">Bare Metal machines</a> and of <a href="https://github.com/openshift/assisted-installer">OpenShift clusters</a>, among other things.</span></span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">In this blog post I describe common characteristics and a generic architecture of <em style="box-sizing: border-box;">management applications</em> that use the Kubernetes API for representing application state.<span face="-apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji"" style="background-color: white;"> I classify the </span><em style="box-sizing: border-box; font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji";">management applications</em><span face="-apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji"" style="background-color: white;"> that use the Kubernetes API as either </span><em style="box-sizing: border-box; font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji";">Kubernetes-native</em><span face="-apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji"" style="background-color: white;"> or as </span><em style="box-sizing: border-box; font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif, "Apple Color Emoji", 
"Segoe UI Emoji";">fully-Kubernetes-native</em><span face="-apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji"" style="background-color: white;"> (I invented the latter term). Lastly, </span>I examine <span style="box-sizing: border-box; font-weight: 600;">scalability bottlenecks of Kubernetes</span> that can hinder applications that manage a large number of objects using the Kubernetes API. I outline possible workarounds for some of the bottlenecks.</span></p><h2 style="text-align: left;"><span style="font-family: arial;">Kubernetes-native applications</span></h2><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">Some of the management applications mentioned above describe themselves as <em style="box-sizing: border-box;">Kubernetes-native</em> or <em style="box-sizing: border-box;">Kubernetes-style</em>: they are implemented and operated using Kubernetes API, CLI and tools from the Kubernetes ecosystem. I did not find a precise definition of what <em style="box-sizing: border-box;">Kubernetes-native</em> is. Sometimes, a more broad term <em style="box-sizing: border-box;">cloud-native</em> is used. In the following list I provide common characteristics of Kubernetes-native applications. 
The list is not meant to be exhaustive and not all Kubernetes-native applications have all the characteristics.</span></p><ul style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px; padding-left: 2em;"><li style="box-sizing: border-box;"><span style="font-family: arial;">Applying the <a href="https://kubernetes.io/docs/concepts/architecture/controller/#controller-pattern">controller pattern</a>, or a special kind of the controller pattern called the <a href="https://kubernetes.io/docs/concepts/extend-kubernetes/operator/">operator pattern</a>.</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">Representing control and configuration information as <a href="https://kubernetes.io/docs/concepts/configuration/configmap/">Kubernetes ConfigMaps</a>, <a href="https://kubernetes.io/docs/concepts/configuration/secret/">Kubernetes Secrets</a> and <a href="https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/">Kubernetes Custom Resources</a>.</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">Using Kubernetes software libraries, such as <a href="https://github.com/kubernetes/client-go">client-go</a>, <a href="https://github.com/kubernetes-sigs/controller-runtime">controller-runtime</a>, <a href="https://sdk.operatorframework.io">OperatorSDK</a>.</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">Application components report <a href="https://prometheus.io/docs/concepts/metric_types/">Prometheus metrics</a>, emit <a href="https://kubernetes.io/docs/tasks/debug-application-cluster/debug-application-introspection/">Kubernetes events</a>, define <a href="https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/">Kubernetes liveness, readiness and startup probes</a>.</span></li><li 
style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">The applications use <a href="https://landscape.cncf.io">CNCF-landscape</a> tools: such as <a href="https://prometheus.io">Prometheus</a> for monitoring and alerts, <a href="https://grafana.com">Grafana</a> for observability, Service Meshes such as <a href="https://linkerd.io">Linkerd</a> for advanced traffic control and security, <a href="https://www.openpolicyagent.org">Open Policy Agent</a> for policy-based control, <a href="https://crossplane.io">Crossplane</a> for infrastructure management, and many others.</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">The applications are operated using <a href="https://about.gitlab.com/topics/gitops/">GitOps</a> and <a href="https://en.wikipedia.org/wiki/Continuous_deployment">Continuous Deployment</a> tools for Kubernetes, such as <a href="https://fluxcd.io">Flux</a> and <a href="https://argo-cd.readthedocs.io">ArgoCD</a>.</span></li></ul><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">In my opinion, a Kubernetes-native application does not have to run on a Kubernetes cluster. It can run elsewhere, communicate with the Kubernetes API of some Kubernetes cluster and have all of the characteristics above or a subset of them.</span></p><h2><span style="font-family: arial;">Fully-Kubernetes-native applications</span></h2><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">Some of the applications even store <span style="box-sizing: border-box; font-weight: 600;">all the configuration data</span> for the objects they manage using the Kubernetes API. 
I call such applications <em style="box-sizing: border-box;">fully Kubernetes-native</em>.</span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">There are multiple advantages of the fully-Kubernetes-native applications:</span></p><ul style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px; padding-left: 2em;"><li style="box-sizing: border-box;"><span style="font-family: arial;">since the applications store all their data in the same database Kubernetes stores its data (<a href="https://kubernetes.io/docs/tasks/administer-cluster/configure-upgrade-etcd/">etcd</a>), there is no need to operate an additional database, to define its schema and to program access to it. The applications get schema validation and versioning of their data <em style="box-sizing: border-box;">for free</em>. The application developers only need to learn the <a href="https://kubernetes.io/docs/reference/using-api/api-concepts/">Kubernetes API </a>and <a href="https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/">Kubernetes Custom Resource Definitions</a>. There is no need to learn SQL or other database query and data-definition languages.</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">the applications get <em style="box-sizing: border-box;">for no cost</em> an API server (<a href="https://kubernetes.io/docs/concepts/overview/components/#kube-apiserver">the Kubernetes API Server</a>) and a CLI (<a href="https://kubernetes.io/docs/reference/kubectl/overview/">kubectl</a>) for their data. 
<code style="border-bottom-left-radius: 6px; border-bottom-right-radius: 6px; border-top-left-radius: 6px; border-top-right-radius: 6px; box-sizing: border-box; font-size: 13.600000381469727px; margin: 0px; padding: 0.2em 0.4em;">kubectl</code> can be extended for the needs of the application using <a href="https://kubernetes.io/docs/tasks/extend-kubectl/kubectl-plugins/">kubectl plugins</a>.</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">the applications use <a href="https://kubernetes.io/docs/reference/access-authn-authz/authentication/">Kubernetes authentication</a> and <a href="https://kubernetes.io/docs/reference/access-authn-authz/authorization/">Kubernetes authorization</a> mechanisms. In particular, the application admins can define <a href="https://kubernetes.io/docs/reference/access-authn-authz/rbac/">Kubernetes Role-Based Access Control (RBAC)</a> rules to specify the operations the users can perform on managed objects, without being required to use external authorization systems.</span></li></ul><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">Kubernetes-native applications that are not fully Kubernetes-native, store some of their state in a database, other than etcd. They have to implement data access to the database, authorization for the data access and some API server on top of the database.</span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">Implementing applications in the <em style="box-sizing: border-box;">fully-Kubernetes-native</em> way might spare development and operational effort, and reduce skill requirements. 
The question, though, is: <span style="box-sizing: border-box; font-weight: 600;">can such applications scale to manage a large number of objects?</span> If one wants to manage <a href="https://www.ibm.com/cloud/smartpapers/5g-edge-computing/">5G network infrastructure or edge devices</a>, one might need to manage tens of thousands, hundreds of thousands or, in the future, maybe even millions of objects. In the remainder of this blog post I describe Kubernetes scalability bottlenecks that can hamper <em style="box-sizing: border-box;">fully-Kubernetes-native</em> applications in managing large numbers of objects. These are the bottlenecks I encountered when working with <em style="box-sizing: border-box;">fully Kubernetes-native</em> applications; readers are welcome to point out more Kubernetes bottlenecks in the comments. </span></p><h2 style="text-align: left;"><span style="font-family: arial;">Fully-Kubernetes-native management applications</span></h2><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">To facilitate explanation of scalability requirements of the management applications, I present a generic architecture for <em style="box-sizing: border-box;">fully-Kubernetes-native</em> management applications in the following diagram:</span></p><div class="separator" style="clear: both; text-align: left;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEi2520gxItvK099ydeKffwcLSnY-HfVU9-2hofYKIIuf41kqzoP5Fv7U_nGUdtjn4-KwOWyYUwXUaKGuBhCsZZWyGnYJ0wjBdthQE7VeXJ0EvaZ0iJ2xJfZ3CsQMzSVUhnbiOEzWojwAmfIJy1Cn1lyinYHSb_Buibj6t85YSWnMsYRpv8Dv2EtZtI=s720" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><span style="font-family: arial;"><img border="0" data-original-height="405" data-original-width="720" height="317" 
src="https://blogger.googleusercontent.com/img/a/AVvXsEi2520gxItvK099ydeKffwcLSnY-HfVU9-2hofYKIIuf41kqzoP5Fv7U_nGUdtjn4-KwOWyYUwXUaKGuBhCsZZWyGnYJ0wjBdthQE7VeXJ0EvaZ0iJ2xJfZ3CsQMzSVUhnbiOEzWojwAmfIJy1Cn1lyinYHSb_Buibj6t85YSWnMsYRpv8Dv2EtZtI=w564-h317" width="564" /></span></a></div><span style="font-family: arial;"><br /></span><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;"><br /></span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;"><br /></span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;"><br /></span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;"><br /></span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;"><br /></span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;"><br /></span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;"><br /></span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;"><br /></span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;"><br 
/></span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;"><br /></span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">In the diagram above, a <em style="box-sizing: border-box;">management application</em> manages some <em style="box-sizing: border-box;">managed entities</em> (sorry for too many "management" words). Examples of <em style="box-sizing: border-box;">managed entities</em> are VMs, bare metal machines, edge devices and other Kubernetes clusters. Each <em style="box-sizing: border-box;">managed entity</em> might have <em style="box-sizing: border-box;">managed subentities</em>. In the case of a VM, examples of <em style="box-sizing: border-box;">managed subentities</em> are network interfaces and disks. In the case of an edge device, examples of <em style="box-sizing: border-box;">managed subentities</em> are mobile applications that run on the device. The <em style="box-sizing: border-box;">management agents</em> run on the <em style="box-sizing: border-box;">managed entities</em> and watch for the desired configuration for their <em style="box-sizing: border-box;">managed entities</em>. The desired configuration resides in the <em style="box-sizing: border-box;">application CRs</em> (Kubernetes Custom Resources). The <em style="box-sizing: border-box;">management agents</em> act according to the desired configuration: for example, they add a disk or a network interface, deploy a mobile application or configure an existing application. The <em style="box-sizing: border-box;">management agents</em> report the status of the <em style="box-sizing: border-box;">managed entities</em> and of the <em style="box-sizing: border-box;">managed subentities</em> by updating the status field of the CRs or by using dedicated CRs. 
The <em style="box-sizing: border-box;">management agents</em> get and update the <em style="box-sizing: border-box;">application CRs</em> through the Kubernetes API.</span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;"><span face="-apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji"" style="background-color: white;">The users of the application can provide the desired configuration for the </span><em style="box-sizing: border-box; font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji";">managed entities and subentities</em><span face="-apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji"" style="background-color: white;"> by creating or updating </span><em style="box-sizing: border-box; font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji";">application CRs</em><span face="-apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji"" style="background-color: white;"> using </span><code style="border-bottom-left-radius: 6px; border-bottom-right-radius: 6px; border-top-left-radius: 6px; border-top-right-radius: 6px; box-sizing: border-box; font-family: SFMono-Regular, Consolas, "Liberation Mono", Menlo, monospace; font-size: 13.600000381469727px; margin: 0px; padding: 0.2em 0.4em;">kubectl</code><span face="-apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji"" style="background-color: white;"> </span><code style="border-bottom-left-radius: 6px; border-bottom-right-radius: 6px; border-top-left-radius: 6px; border-top-right-radius: 6px; box-sizing: border-box; 
font-family: SFMono-Regular, Consolas, "Liberation Mono", Menlo, monospace; font-size: 13.600000381469727px; margin: 0px; padding: 0.2em 0.4em;">create</code><span face="-apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji"" style="background-color: white;">, </span><code style="border-bottom-left-radius: 6px; border-bottom-right-radius: 6px; border-top-left-radius: 6px; border-top-right-radius: 6px; box-sizing: border-box; font-family: SFMono-Regular, Consolas, "Liberation Mono", Menlo, monospace; font-size: 13.600000381469727px; margin: 0px; padding: 0.2em 0.4em;">apply</code><span face="-apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji"" style="background-color: white;">, </span><code style="border-bottom-left-radius: 6px; border-bottom-right-radius: 6px; border-top-left-radius: 6px; border-top-right-radius: 6px; box-sizing: border-box; font-family: SFMono-Regular, Consolas, "Liberation Mono", Menlo, monospace; font-size: 13.600000381469727px; margin: 0px; padding: 0.2em 0.4em;">edit</code><span face="-apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji"" style="background-color: white;"> commands. 
The users can get the status of the </span><em style="box-sizing: border-box; font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji";">managed entities and subentities </em><span face="-apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji"" style="background-color: white;">by reading </span><em style="box-sizing: border-box; font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji";">application CRs</em><span face="-apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji"" style="background-color: white;"> using </span><code style="border-bottom-left-radius: 6px; border-bottom-right-radius: 6px; border-top-left-radius: 6px; border-top-right-radius: 6px; box-sizing: border-box; font-family: SFMono-Regular, Consolas, "Liberation Mono", Menlo, monospace; font-size: 13.600000381469727px; margin: 0px; padding: 0.2em 0.4em;">kubectl get</code>. GitOps and other tools such as Web consoles and dashboards, from the Kubernetes ecosystem or custom ones, can operate the <em style="box-sizing: border-box;">managed entities and subentities </em>through the Kubernetes API. Note that while the <em style="box-sizing: border-box;">Kubernetes API for management</em> and the <em style="box-sizing: border-box;">Kubernetes API for agents</em> are represented in the diagram as different objects, they are served by the same Kubernetes API server and may be identical.</span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">The users can use a custom Web console to perform management operations by GUI instead of by CLI. 
<span face="-apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif, "Apple Color Emoji", "Segoe UI Emoji"" style="background-color: white;">(I omitted the Web console on the diagram above for brevity.) </span>The <em style="box-sizing: border-box;">management application</em> may be implemented as a set of controllers/operators that watch the <em style="box-sizing: border-box;">application CRs</em>. The <em style="box-sizing: border-box;">management application</em> performs reconciliation of the desired state (of <em style="box-sizing: border-box;">managed entities</em> or of <em style="box-sizing: border-box;">managed subentities</em>), and of the actual state reported by the <em style="box-sizing: border-box;">management agents</em>. Note that while the <em style="box-sizing: border-box;">management application</em> and the external tools are depicted on the diagram outside of the Kubernetes cluster, they may run inside the cluster. (They will consume the same Kubernetes API in both cases: running inside or outside the cluster). The <em style="box-sizing: border-box;">management agents</em>, the <em style="box-sizing: border-box;">management application</em> and the external tools may be implemented using standard Kubernetes libraries and may operate according to the controller pattern.</span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">One of the important aspects of management is security. A compromised agent on one managed entity might access or modify data of other managed entities in the application CRs. In Kubernetes such attacks can be prevented by restricting the permissions of the agent to allow access only to the CRs of the agent's entity. 
Restricting the permissions of an agent can be accomplished in Kubernetes declaratively by allocating a <a href="https://kubernetes.io/docs/reference/access-authn-authz/service-accounts-admin/">service account</a> to represent the agent and by specifying Kubernetes access control rules for the service account. Alternatively, <a href="https://kubernetes.io/docs/reference/access-authn-authz/webhook/">authorization WebHooks</a> or <a href="https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/">admission controllers</a> (for control of updates only) might be used, but those require implementing and maintaining additional components.</span></p><p style="text-align: left;"><span style="font-size: medium;"><span style="font-family: arial;">Notice an additional important point in this architecture: the management agents initiate network connections to the Kubernetes cluster that hosts the application CRs; the management application does not initiate network connections to the managed entities. The management agents <i>pull</i> the configuration data; the management application does not <i>push</i> the configuration data to the managed entities directly. Such a design facilitates network communication in the case where the managed entities are deployed on <a href="https://www.redhat.com/en/topics/edge-computing/what-is-edge-architecture">the edge</a>:</span><span style="font-family: arial;"><br /></span></span></p><p style="text-align: left;"></p><p style="text-align: left;"></p><ol style="text-align: left;"><li><span style="font-family: arial; font-size: medium;">The managed entities may be intermittently connected to the network. When a managed entity becomes connected, its management agent connects to the Kubernetes cluster, pulls the configuration data for its entity and reports the status back. 
If the management application initiated connections to the managed entities, it would have to handle network disconnections in an environment with limited network connectivity.</span></li><li><span style="font-family: arial; font-size: medium;">On the edge, there could be a firewall that prevents initiating network connections from the outside and allows initiating network connections by the managed entities only.</span></li></ol><span style="caret-color: rgb(36, 41, 46); color: #24292e; font-family: arial; font-size: 16px;">If you managed to read until this point, you know what I mean by </span><em style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-family: arial; font-size: 16px;">fully Kubernetes-native management applications</em><span style="caret-color: rgb(36, 41, 46); color: #24292e; font-family: arial; font-size: 16px;">. In the following paragraphs I describe Kubernetes scalability bottlenecks and their relevance to the fully Kubernetes-native management applications.</span><p></p><h2 style="text-align: left;"><span style="font-family: arial;">Kubernetes scalability bottlenecks</span></h2><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">The most obvious scalability bottleneck is <span style="box-sizing: border-box; font-weight: 600;">storage</span>. The etcd documentation <a href="https://etcd.io/docs/v3.5/dev-guide/limit">recommends</a> 8GB as the maximum storage size. This means that an application that manages one million objects can store at most about 8KB of data per object (8GB / 1,000,000). 
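</span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">The arithmetic behind this estimate can be sketched in a few lines of Python. This is only a back-of-the-envelope helper, not a measurement: the function names are made up for illustration, decimal units are assumed (1 GB = 10^9 bytes), and the whole 8GB is assumed to be available to the application.</span></p>

```python
# Back-of-the-envelope storage budget. Assumptions: decimal units
# (1 GB = 10**9 bytes) and the entire budget is available to the application.
ETCD_RECOMMENDED_MAX_BYTES = 8 * 10**9  # the 8GB figure quoted from the etcd documentation


def per_object_budget(total_bytes, object_count):
    """Return the average number of bytes available per stored object."""
    if object_count <= 0:
        raise ValueError("object_count must be positive")
    return total_bytes // object_count


def fits(total_bytes, object_count, object_size_bytes):
    """Return True if object_count objects of the given size fit into total_bytes."""
    return object_count * object_size_bytes <= total_bytes


if __name__ == "__main__":
    # One million objects sharing the recommended maximum: 8000 bytes each.
    print(per_object_budget(ETCD_RECOMMENDED_MAX_BYTES, 1_000_000))
```

<p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">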
In practice, this limit is lower since etcd stores other cluster data and also a <a href="https://kubernetes.io/docs/reference/using-api/api-concepts/#efficient-detection-of-changes">limited history</a> of the objects.</span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">If authorization is handled by creating a service account per management agent as described above, then the management application needs to allocate one million service accounts to be able to manage one million managed entities in a secure way. The size of a service account in Kubernetes can be on the order of 10KB, which means the management application cannot create one million service accounts within the recommended etcd storage limit of 8GB (10KB*1,000,000 = 10GB). Compare this storage limit with the storage limits of a leading SQL database, <a href="https://www.postgresql.org/docs/14/limits.html">PostgreSQL</a>. The database size is virtually unlimited, a single relation can be as large as 32TB and a single field can be as large as 1GB (!).</span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">A possible solution to the storage limitations of etcd is <a href="https://github.com/k3s-io/kine">Kine</a>. Kine is an etcd shim, an adapter of the etcd API to other databases, like MySQL and PostgreSQL. However, even if the storage problem of Kubernetes is solved, there is another bottleneck for Kubernetes scalability, namely the <span style="box-sizing: border-box; font-weight: 600;"><a href="https://kubernetes.io/docs/reference/using-api/api-concepts/#single-resource-api">Single-resource API</a></span>. 
The mutating API verbs, such as <code style="border-bottom-left-radius: 6px; border-bottom-right-radius: 6px; border-top-left-radius: 6px; border-top-right-radius: 6px; box-sizing: border-box; font-size: 13.600000381469727px; margin: 0px; padding: 0.2em 0.4em;">CREATE</code>, <code style="border-bottom-left-radius: 6px; border-bottom-right-radius: 6px; border-top-left-radius: 6px; border-top-right-radius: 6px; box-sizing: border-box; font-size: 13.600000381469727px; margin: 0px; padding: 0.2em 0.4em;">UPDATE</code> and <code style="border-bottom-left-radius: 6px; border-bottom-right-radius: 6px; border-top-left-radius: 6px; border-top-right-radius: 6px; box-sizing: border-box; font-size: 13.600000381469727px; margin: 0px; padding: 0.2em 0.4em;">DELETE</code>, support single resources only. A client that wishes to apply such a verb to many resources must make a separate request for each of those resources.</span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">Consider the case where the management application needs to rotate one million certificates or update common configuration for one million managed entities. Another case is when a management agent has to update the status of multiple subentities of the managed entity. If the management application or a management agent needs to update a large number of application CRs at once, tough luck: it must perform the updates one by one. Performing updates one by one involves a network roundtrip per application CR, including handling network failures and retries for each CR. 
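</span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">The cost of one-by-one updates can be illustrated with a small Python sketch. It is a simulation, not Kubernetes client code: send_update is a hypothetical callable that stands in for a single-resource API request and raises ConnectionError on a network failure, and the retry policy (three attempts, no backoff by default) is an assumption made for the example.</span></p>

```python
import time


def update_one_by_one(names, send_update, max_attempts=3, backoff_seconds=0.0):
    """Send one update request per resource, retrying each resource independently.

    send_update is a hypothetical stand-in for a single-resource API call.
    Returns a tuple (total_requests, failed_names).
    """
    total_requests = 0
    failed = []
    for name in names:
        for attempt in range(1, max_attempts + 1):
            total_requests += 1  # every attempt costs one network round trip
            try:
                send_update(name)
                break  # this resource is done, move on to the next one
            except ConnectionError:
                if attempt == max_attempts:
                    failed.append(name)  # give up on this resource
                else:
                    time.sleep(backoff_seconds * attempt)
    return total_requests, failed


def estimated_duration_ms(object_count, round_trip_ms):
    """Lower bound for strictly sequential updates: one round trip per object."""
    return object_count * round_trip_ms
```

<p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">With an assumed round trip of 50 ms, estimated_duration_ms(1_000_000, 50) gives 50,000,000 ms, roughly 14 hours of strictly sequential requests before any retries are counted.</span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">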
The issue can be especially acute when connectivity between a management agent and the Kubernetes cluster is limited or latency is high, for example when managing edge devices.</span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">Compare the Kubernetes API with a SQL database: in SQL you can insert multiple rows with a single <a href="https://www.postgresql.org/docs/14/sql-insert.html">INSERT</a> statement, and you can update multiple rows with a single <a href="https://www.postgresql.org/docs/14/sql-update.html">UPDATE ... WHERE</a> statement, where the <code style="border-bottom-left-radius: 6px; border-bottom-right-radius: 6px; border-top-left-radius: 6px; border-top-right-radius: 6px; box-sizing: border-box; font-size: 13.600000381469727px; margin: 0px; padding: 0.2em 0.4em;">WHERE</code> clause selects the rows to be updated. There are also batch operations for sending multiple INSERT and UPDATE commands in one batch. Some SQL databases even have binary bulk operations, where multiple inserts are encoded into a binary blob on the client side and sent to the server in binary form. 
See, for example, <em style="box-sizing: border-box;">batch queries</em> and <em style="box-sizing: border-box;">COPY protocol support for faster bulk data loads</em> in the <a href="https://github.com/jackc/pgx#features">pgx</a> Go driver for PostgreSQL.</span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; margin-bottom: 16px; margin-top: 0px;"><span><span style="font-family: arial; font-size: 16px;">Yet another Kubernetes API problem related to scalability is the </span><span style="box-sizing: border-box; font-family: arial; font-size: 16px; font-weight: 600;"><a href="https://github.com/kubernetes/kubernetes/issues/80602">lack of the ability to specify the sort order</a></span><span style="font-family: arial; font-size: 16px;"> of the returned lists of objects. The users of </span><code style="border-bottom-left-radius: 6px; border-bottom-right-radius: 6px; border-top-left-radius: 6px; border-top-right-radius: 6px; box-sizing: border-box; margin: 0px; padding: 0.2em 0.4em;"><span style="font-family: courier;">kubectl</span></code><span style="font-family: arial; font-size: 16px;"> can specify the sort order using the -</span><a href="https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#get" style="font-family: arial; font-size: 16px;">-sort-by</a><span style="font-family: arial; font-size: 16px;"> flag. However, sorting is performed on the client side by </span><span style="font-family: courier; font-size: 16px;">kubectl</span><span style="font-family: arial; font-size: 16px;">. In a similar way, other tools may implement client-side sorting for various sort orders. Fetching one million objects and sorting them on the client may strain the client's computational resources (memory and CPU) and waste bandwidth, especially when sorting is used with pagination. Consider the case where a user wants to see the top ten objects out of one million, according to some criterion. 
In this case the client (for example a Web browser) must fetch all one million objects and find the first ten in the requested sort order. Alternatively, the clients could use some proxy on top of the Kubernetes API, and let this proxy component perform caching, sorting and pagination for the clients. This would require additional effort to develop and maintain the proxy component, and additional computational resources to run it.</span></span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">Another possible performance bottleneck of the Kubernetes API is the <span style="box-sizing: border-box; font-weight: 600;"><a href="https://kubernetes.io/docs/reference/using-api/api-concepts/#alternate-representations-of-resources">lack of protocol-buffers support for CRDs</a></span>. <a href="https://developers.google.com/protocol-buffers">Protocol buffers</a> with <a href="https://grpc.io">gRPC</a> may provide <a href="https://blog.dreamfactory.com/grpc-vs-rest-how-does-grpc-compare-with-traditional-rest-apis/">7 to 10 times faster message transmission</a> compared to a REST API.</span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">One more scalability issue with the Kubernetes ecosystem is the <span style="box-sizing: border-box; font-weight: 600;"><a href="https://github.com/kubernetes-sigs/controller-runtime/issues/1456#issuecomment-871006968">lack of a built-in mechanism for load balancing between replicas of controllers</a></span>. There is no built-in mechanism in Kubernetes to distribute the processing of CR changes among multiple replicas of the same controller. 
The controller replicas usually perform <a href="https://github.com/kubernetes-sigs/controller-runtime/blob/d887b2f1c68fee51b20378ff2a0bd6a8a77674dd/pkg/manager/manager.go#L140">leader election</a> and the elected leader performs all the reconciliation.</span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">A side note on controller concurrency: by default, the controllers that use <a href="https://github.com/kubernetes-sigs/controller-runtime">controller-runtime</a> do not perform reconciliation concurrently (the <a href="https://github.com/kubernetes-sigs/controller-runtime/blob/d887b2f1c68fee51b20378ff2a0bd6a8a77674dd/pkg/controller/controller.go#L38">MaxConcurrentReconciles</a> option defaults to 1). Also, the <a href="https://github.com/kubernetes/client-go/blob/2f52a105e63e9ac7affb1a0f533c62883611ba81/util/workqueue/default_rate_limiters.go#L42">work queue retry rate limit</a> (in case of multiple errors) is only 10 QPS by default. To understand rate limiting in controller-runtime and the Kubernetes Go client, see <a href="https://danielmangum.com/posts/controller-runtime-client-go-rate-limiting/">this great article</a>.</span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">While the defaults mentioned above can be changed by controller developers to increase concurrency, reconciling one million application CRs can strain the computational resources of a single controller. If the developers of the management application want to use the controller pattern with one million CRs, they must implement custom load balancing solutions, spending additional development and maintenance effort. 
Note that various tools in the Kubernetes ecosystem are implemented as controllers, for example <a href="https://argo-cd.readthedocs.io/en/stable/#architecture">ArgoCD</a> and <a href="https://fluxcd.io/docs/components/">Flux</a>.</span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">A general problem with Kubernetes clients, commonly but not exclusively appearing in controllers, is the <span style="box-sizing: border-box; font-weight: 600;">local caches</span> maintained by <a href="https://aly.arriqaaq.com/kubernetes-informers/">informers</a> (commonly used in building controllers). An informer maintains a local cache of all the objects in its purview, which is problematic when that data volume is large, as reported <a href="https://github.com/kubernetes/client-go/issues/871">here</a> and <a href="https://github.com/kyverno/kyverno/issues/1832">here</a>.</span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">Another major scalability bottleneck in Kubernetes is <span style="box-sizing: border-box; font-weight: 600;">authorization</span>. Consider the following use case: a user of an application that manages one million entities must be allowed to access only ten entities out of the million. Such a requirement can be specified using <a href="https://kubernetes.io/docs/reference/access-authn-authz/rbac/">Kubernetes RBAC</a>, one of the <a href="https://kubernetes.io/docs/reference/access-authn-authz/authorization/#authorization-modules">Kubernetes authorization modes</a>, by granting GET access to the ten CRs of the allowed managed entities and by not granting LIST access to the managed entity CRD. (If LIST access is granted, the user will be able to read all the managed entities.) 
However, with such authorization, the user cannot GET the ten CRs they are allowed to GET without specifying all ten CRs explicitly by name. This is because the Kubernetes API's LIST operation <a href="https://github.com/kubernetes/kubernetes/issues/54079">does not perform filtering based on authorization</a>. The clients can either get all the CRs or some specific CRs. Moreover, there is no API to ask Kubernetes which objects some user is allowed to access. Kubernetes clients can only <a href="https://kubernetes.io/docs/reference/access-authn-authz/authorization/#checking-api-access">query the Kubernetes authorization API</a> to ask whether a given user can access some specific object or whether a given user can access all the objects of some kind. In practice, this means that if some client tool, like a GitOps tool or a UI, needs to process objects on behalf of a user who has access to ten objects out of one million, the tool must query the Kubernetes API one million times, once per object, to filter the objects the user is allowed to access. That would be a major performance bottleneck (one million network calls to the authorization API). A security issue with the scenario above is that the tool must be authorized to LIST all the objects.</span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">A solution to the problem above could be creating an <em style="box-sizing: border-box;">authorization cache</em> that continuously calculates all the authorization decisions and caches them. Such a cache can be combined with the proxy mentioned above, used for sorting and pagination. 
Implementing and maintaining such a custom authorization/sorting/pagination cache and proxy would require significant development effort.</span></p><h2 style="text-align: left;"><span style="font-family: arial;">Summary</span></h2><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px;"><span style="font-family: arial;">To summarize, let me list the Kubernetes scalability bottlenecks examined above:</span></p><ul style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 16px; margin-top: 0px; padding-left: 2em;"><li style="box-sizing: border-box;"><span style="font-family: arial;">Storage</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">Single-resource API</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">No sort order for server-side sorting</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">No protocol-buffers support for CRDs</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">No load balancing between replicas of controllers</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">Extensive client-side caching</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><span style="font-family: arial;">No filtering by authorization</span></li><li style="box-sizing: border-box; margin-top: 0.25em;"><a href="https://en.wikipedia.org/wiki/There_are_known_knowns"><span style="font-family: arial;">Unknown unknowns</span></a></li></ul><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; margin-bottom: 16px; margin-top: 0px;"><span style="font-size: medium;"><span style="font-family: arial;">The last item in the list above relates to unknown bottlenecks that could be 
discovered once all other problems in the list are solved. There is evidence of large-scale fully-Kubernetes-native management from </span><a href="https://github.com/rancher/fleet" style="font-family: arial;">Rancher Fleet</a><span style="font-family: arial;"> that managed to (sorry again for too many "management" words) import one million managed entities (Kubernetes clusters in that case). </span><a href="https://www.suse.com/c/rancher_blog/scaling-fleet-and-kubernetes-to-a-million-clusters/" style="font-family: arial;">This blog post</a><span style="font-family: arial;"> describes that experiment and, in particular, using Kine to overcome the storage problems. </span><span><span style="font-family: arial;">However, the authors of the blog post did not provide details on how effectively the actual management tasks could be performed after importing, initial discovery of deployments and reporting the status back. It would be interesting to know how much time it took to deploy a new application to one million managed clusters, whether the UI was able to perform sorting and pagination of one million managed clusters, and how effective </span><span style="font-family: courier;">kubectl</span><span style="font-family: arial;"> was for working with one million managed clusters.</span></span></span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 0px; margin-top: 0px;"><span style="font-family: arial;">In my opinion, in order to support <em style="box-sizing: border-box;">fully-Kubernetes-native management</em> at large scale, without requiring custom solutions for storage, caching, sorting, load balancing and authorization, Kubernetes must be significantly changed.</span></p><p style="box-sizing: border-box; caret-color: rgb(36, 41, 46); color: #24292e; font-size: 16px; margin-bottom: 0px; margin-top: 0px;"><span style="font-family: arial;"><br /></span></p><p style="box-sizing: border-box; margin-bottom: 0px; 
margin-top: 0px;"><i><span style="font-family: arial; font-size: medium;"><span style="color: #24292e;"><span style="caret-color: rgb(36, 41, 46);">I would like to thank </span></span><a href="https://github.com/MikeSpreitzer">Mike Spreitzer</a><span style="color: #24292e;"><span style="caret-color: rgb(36, 41, 46);"> for reviewing this blog post, for providing enlightening comments, and for the great discussions we had </span></span></span></i><i><span style="font-family: arial; font-size: medium;"><span style="color: #24292e;">about Kubernetes.</span></span></i><i><span style="font-family: arial; font-size: medium;"><span style="color: #24292e;"><span style="caret-color: rgb(36, 41, 46);"> Thanks to <a href="https://github.com/vMaroon">Maroon Ayoub</a> for his review.</span></span></span></i></p>
Published 2019-03-26, updated 2022-03-19.
<div dir="ltr" style="text-align: left;" trbidi="on">
<h1 style="text-align: left;"><span style="font-weight: normal;">Checklist: pros and cons of using multiple Kubernetes clusters, and how to distribute workloads between them</span></h1>
<div>
<br /></div>
<div class="graf graf--p" name="2a7a">
Here is a list of pros and cons I found for using multiple clusters vs. a single one.</div>
<div class="graf graf--p" name="2a7a">
<br /></div>
<h3 style="text-align: left;">
<span class="markup--strong markup--p-strong" style="font-weight: normal;">Reasons to have multiple clusters</span></h3>
<ul class="postList">
<li class="graf graf--li" name="b4c0">Scalability limits, for example a Kubernetes cluster has <a class="markup--anchor markup--li-anchor" data-href="https://kubernetes.io/docs/setup/cluster-large/" href="https://kubernetes.io/docs/setup/cluster-large/" rel="noopener" target="_blank">a limit of 150,000 pods</a>. An OpenShift cluster has <a class="markup--anchor markup--li-anchor" data-href="https://docs.openshift.com/container-platform/3.11/scaling_performance/cluster_limits.html#scaling-performance-current-cluster-limits" href="https://docs.openshift.com/container-platform/3.11/scaling_performance/cluster_limits.html#scaling-performance-current-cluster-limits" rel="noopener" target="_blank">a limit of 10,000 services</a>.</li>
<li class="graf graf--li" name="9c94">Separation of production/development/test<br />especially for testing a new version of Kubernetes, of a service mesh, or of other cluster software.</li>
<li class="graf graf--li" name="0e29">Compliance<br />according to some regulations, some applications must run in separate clusters/separate VPNs.</li>
<li class="graf graf--li" name="103e">Multi-vendor<br />run clusters of multiple providers to prevent vendor lock-in.</li>
<li class="graf graf--li" name="1930">Cloud/on-prem<br />to split the load between cloud and on-premise services.</li>
<li class="graf graf--li" name="12b7">Regionality for latency<br />run clusters in different geographical regions to reduce latency in those regions.</li>
<li class="graf graf--li" name="d472">Regionality for availability<br />run clusters in different regions/availability zones to reduce the damage caused by a failing datacenter/region.</li>
<li class="graf graf--li" name="5823">Better isolation for security</li>
<li class="graf graf--li" name="2c3b">Isolation for easier billing/resource allocation</li>
</ul>
<h3 style="text-align: left;">
<span class="markup--strong markup--p-strong" style="font-weight: normal;">Reasons to have a single cluster</span></h3>
<ul class="postList">
<li class="graf graf--li" name="8e5e">Reduce setup, maintenance and administration overhead</li>
<li class="graf graf--li" name="cc50">Improve utilization</li>
<li class="graf graf--li" name="411f">Reduce latency between applications (compared to applications spread across multiple clusters)</li>
<li class="graf graf--li" name="411f">Cost reduction</li>
</ul>
<h3 style="text-align: left;">
<span class="markup--strong markup--p-strong" style="font-weight: normal;">How to allocate workloads to clusters</span></h3>
<ul class="postList">
<li class="graf graf--li" name="81c3">Compliance<br />some applications must run on separate clusters.</li>
<li class="graf graf--li" name="444f">Locality for latency<br />allocate the applications according to the regions, to reduce latency.</li>
<li class="graf graf--li" name="1bb3">Billing/Quotas<br />allocate applications together per billing account, to facilitate billing/quota enforcement.</li>
<li class="graf graf--li" name="c1c3">Maintainability<br />put the applications in the same cluster when it makes sense to perform maintenance of the cluster for all of them (upgrading the Kubernetes version, etc.).</li>
<li class="graf graf--li" name="2ae0">Hardware requirements<br />allocate high-performance applications to clusters with hardware for high performance.</li>
<li class="graf graf--li" name="d33e">Dependencies<br />reduce the need for inter-cluster service registries by allocating dependent applications together.</li>
<li class="graf graf--li" name="2456">Identity and Access management<br />allocate applications in such a way that in-cluster identity and access management would suffice.</li>
<li class="graf graf--li" name="2733">Monitoring, tracing, logging<br />allocate applications to reduce the need for distributed monitoring, tracing, logging.</li>
</ul>
<h3 style="text-align: left;">
<span class="markup--strong markup--p-strong" style="font-weight: normal;">Sources:</span></h3>
<ul class="postList">
<li class="graf graf--li" name="558a"><a href="https://kubernetes.io/docs/setup/cluster-large/">https://kubernetes.io/docs/setup/cluster-large/</a></li>
<li class="graf graf--li" name="9f51"><a class="markup--anchor markup--li-anchor" data-href="https://docs.openshift.com/container-platform/3.11/scaling_performance/cluster_limits.html#scaling-performance-current-cluster-limits" href="https://docs.openshift.com/container-platform/3.11/scaling_performance/cluster_limits.html#scaling-performance-current-cluster-limits" rel="noreferrer nofollow noopener" target="_blank">https://docs.openshift.com/container-platform/3.11/scaling_performance/cluster_limits.html#scaling-performance-current-cluster-limits</a></li>
<li class="graf graf--li" name="813f"><a class="markup--anchor markup--li-anchor" data-href="https://kubernetes.io/docs/concepts/cluster-administration/federation/" href="https://kubernetes.io/docs/concepts/cluster-administration/federation/" rel="noreferrer nofollow noopener" target="_blank">https://kubernetes.io/docs/concepts/cluster-administration/federation/</a></li>
<li class="graf graf--li" name="f18a"><a class="markup--anchor markup--li-anchor" data-href="https://cloud.google.com/solutions/scope-and-size-kubernetes-engine-clusters" href="https://cloud.google.com/solutions/scope-and-size-kubernetes-engine-clusters" rel="noreferrer nofollow noopener" target="_blank">https://cloud.google.com/solutions/scope-and-size-kubernetes-engine-clusters</a></li>
<li class="graf graf--li" name="7df8"><a class="markup--anchor markup--li-anchor" data-href="https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-organizing-with-namespaces" href="https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-organizing-with-namespaces" rel="noreferrer nofollow noopener" target="_blank">https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-organizing-with-namespaces</a></li>
</ul>
</div>
Published 2011-10-29, updated 2011-12-20.
<h1 style="text-align: left;"><span style="font-weight: normal;">On the difference between Linked Data and Semantic Web</span></h1>
After being confused for some time about the difference between Linked Data and the Semantic Web, and after reading some resources about both concepts, I would like to share my interpretation of what I read. <br />
<br />
The Semantic Web is a vision of (among some other things) creating a Web of Data. Linked Data is a concrete means to achieve (a lightweight version of) that vision. I will explain later what I mean by the lightweight version. For now, Linked Data can be seen as a reference implementation of the Semantic Web, one of several possible implementations. Semantic Web is <i>What</i> and Linked Data is <i>How</i>.<br />
<br />
According to <a href="http://linkeddatabook.com/editions/1.0/">the Linked Data book</a>: <i>"Linked Data provides a publishing paradigm in which not only documents, but also data, can be a first class citizen of the Web, thereby enabling the extension of the Web with a global data space based on open standards - the Web of Data."</i><br />
<br />
According to <a href="http://www.w3.org/standards/semanticweb/data">the W3C Linked Data page</a>: <i>"The Semantic Web is a Web of Data... to make the Web of Data a reality, it is important to have the huge amount of data on the Web available in a standard format, reachable and manageable by Semantic Web tools. Furthermore, not only does the Semantic Web need access to data, but relationships among data should be made available, too, to create a Web of Data (as opposed to a sheer collection of datasets). This collection of interrelated datasets on the Web can also be referred to as Linked Data.</i><br />
<i>...</i><br />
<i>Linked Data lies at the heart of what Semantic Web is all about: large scale integration of, and reasoning on, data on the Web."</i><br />
<br />
According to <a href="http://tomheath.com/papers/bizer-heath-berners-lee-ijswis-linked-data.pdf">the article of Chris Bizer, Tom Heath and TimBL, Linked Data - the Story so far</a>: <i>"... while the Semantic Web, or Web of Data, is the goal or the end result of this process, Linked Data provides the means to reach that goal. ... Over time, with Linked Data as a foundation, some of the more sophisticated proposals associated with the Semantic Web vision, such as intelligent agents, may become a reality."</i><br />
<br />
So Linked Data constitutes a paradigm of publishing data sets on the Web in order to achieve the goal of creating a Web of Data - part of the vision of the Semantic Web. The published interrelated data sets themselves are also referred to as Linked Data.<br />
<br />
There are, however, at least two differences between the original vision of the Semantic Web and the vision that the Linked Data principles help to achieve.<br />
<br />
The first difference is about the usage of URIs. According to the Linked Data principles, URIs have to be dereferenceable, while there is no such requirement for RDF. In the citations below, the bold emphasis is mine.<br />
<br />
<a href="http://www.w3.org/TR/rdf-primer/#rdfmodel">RDF Primer:</a> <br />
<i>In addition, <b>sometimes</b> an organization will use a vocabulary's namespace URIref as the URL of a Web resource that provides further information about that vocabulary... Accessing ... namespace URIref in a Web browser will retrieve additional information about the ... vocabulary... However, this is also <b>just a convention</b>. RDF does not assume that a namespace URI identifies a retrievable Web resource</i> <br />
<br />
<a href="http://linkeddatabook.com/editions/1.0/#htoc72">The Linked Data book:</a><br />
<i>The primary means of publishing Linked Data on the Web is by making URIs dereferenceable, thereby enabling the follow-your-nose style of data discovery. This should be considered the <b>minimal requirements</b> for Linked Data publishing.</i><br />
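<br />
To make "dereferenceable" concrete, here is a minimal sketch of what looking up a URI could involve on the client side. It is an illustration only: the example.org URI, the function names and the choice of requested media types are my assumptions and are not taken from the sources cited above.<br />

```python
from urllib.request import Request, urlopen

# Assumed media types for RDF; a real server may offer others.
RDF_ACCEPT = "text/turtle, application/rdf+xml;q=0.9"


def build_lookup_request(uri: str) -> Request:
    """Prepare an HTTP GET that asks for an RDF description of `uri`."""
    # A fragment identifier (#...) is never sent to the server, so drop it.
    document_uri = uri.split("#", 1)[0]
    return Request(document_uri, headers={"Accept": RDF_ACCEPT})


def dereference(uri: str, timeout: float = 10.0) -> bytes:
    """Perform the lookup (needs network access; not called in this sketch)."""
    with urlopen(build_lookup_request(uri), timeout=timeout) as response:
        return response.read()


# Hypothetical vocabulary term, used only to show the request that would be sent.
request = build_lookup_request("http://example.org/vocab#knows")
print(request.full_url)
print(request.get_header("Accept"))
```

In plain RDF the same URI may be nothing more than a name, so such a lookup is allowed to fail. Under the Linked Data principles it is expected to return a description of the term.<br />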
<br />
The second difference is about ontological axioms. According to <a href="http://linkeddatabook.com/editions/1.0/">the Linked Data book</a> <a href="http://linkeddatabook.com/editions/1.0/#htoc55">ontological axioms should be used sparingly</a>: <br />
<br />
<i>"Only define things that matter – for example, defining domains and ranges helps clarify how properties should be used, but over-specifying a vocabulary can also produce unexpected inferences when the data is consumed. Thus you should not overload vocabularies with ontological axioms, but better define terms rather loosely (for instance, by using only the RDFS and OWL terms introduced above). " </i><br />
<br />
The RDFS and OWL terms introduced in <a href="http://linkeddatabook.com/editions/1.0/">the Linked Data book</a> are:<br />
<ul><li>rdf:type </li>
<li>rdfs:Class</li>
<li>rdf:Property</li>
<li>rdfs:subClassOf</li>
<li>rdfs:subPropertyOf</li>
<li>rdfs:domain</li>
<li>rdfs:range</li>
<li>rdfs:label</li>
<li>rdfs:comment</li>
<li>owl:Ontology</li>
<li>owl:ObjectProperty</li>
<li>owl:inverseOf</li>
<li>owl:equivalentClass</li>
<li>owl:equivalentProperty</li>
<li>owl:inverseFunctionalProperty</li>
</ul>So this is what I meant by writing that Linked Data is a means to reach a <i>lightweight version</i> of Semantic Web - the Web of Data with limited use of ontologies and knowledge representation.<br />
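<br />
To give a feeling for how little machinery such a lightweight vocabulary needs, below is a rough sketch of the inference that two of the terms above, rdfs:domain and rdfs:subClassOf, license. It works on plain tuples instead of a real RDF library, all the ex: names are made up for the illustration, and it covers only these two rules, not the full RDFS semantics.<br />

```python
TYPE, SUBCLASS, DOMAIN = "rdf:type", "rdfs:subClassOf", "rdfs:domain"


def infer_types(triples):
    """Add the rdf:type triples entailed by rdfs:domain and rdfs:subClassOf."""
    facts = set(triples)
    while True:
        domains = {(s, o) for s, p, o in facts if p == DOMAIN}
        subclasses = {(s, o) for s, p, o in facts if p == SUBCLASS}
        derived = set()
        for s, p, o in facts:
            # rdfs:domain: the subject of property p is an instance of p's domain.
            derived.update((s, TYPE, cls) for prop, cls in domains if prop == p)
            # rdfs:subClassOf: an instance of a class is an instance of its superclasses.
            if p == TYPE:
                derived.update((s, TYPE, sup) for sub, sup in subclasses if sub == o)
        if derived.issubset(facts):
            return facts  # fixed point reached, nothing new to add
        facts |= derived


# Hypothetical vocabulary and data, for illustration only.
vocabulary = [
    ("ex:author", DOMAIN, "ex:Book"),
    ("ex:Book", SUBCLASS, "ex:Document"),
]
data = [("ex:mobyDick", "ex:author", "ex:melville")]

inferred = infer_types(vocabulary + data)
for triple in sorted(inferred - set(vocabulary + data)):
    print(triple)
```

From a single data triple, the vocabulary lets a consumer conclude that ex:mobyDick is an ex:Book and therefore also an ex:Document. This is the kind of "basic" inference that loosely defined terms already make possible.<br />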
<br />
One might ask where the use of RDFS and OWL appears in the Linked Data principles. It is actually in <a href="http://www.w3.org/DesignIssues/LinkedData.html">principle 3</a>: <i>"When someone looks up a <small>URI</small>, provide useful information, using the standards (RDF*, SPARQL)" </i><br />
<br />
Once you use URIs for RDF properties, looking up the properties should provide information about the properties - information expressed by RDFS and OWL.<br />
<br />
In <a href="http://www.w3.org/2008/Talks/0617-lod-tbl/">this talk about Linked Data</a> TimBL mentions using <a href="http://www.w3.org/2008/Talks/0617-lod-tbl/#%286%29">Ontology bits for basic inference</a>: <i>"Inference - smarter query"</i> and <a href="http://www.w3.org/2008/Talks/0617-lod-tbl/#%287%29">Ontology bits for Validation and Constraining input</a>: <i>"... mistakes to be spotted... user input menus to be constrained". </i><br />
<br />
Note the words <i>"basic" </i>and <i>"bits"</i>. <br />
<br />
<a href="http://www-sop.inria.fr/acacia/cours/essi2006/Scientific%20American_%20Feature%20Article_%20The%20Semantic%20Web_%20May%202001.pdf">The seminal paper "The Semantic Web" in Scientific American from 2001</a> talked about inference rules in the ontologies, for example:<br />
<i>Inference rules in ontologies supply further power. An ontology may express the rule "If a city code is associated with a state code, and an address uses that city code, then that address has the associated state code." </i><br />
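<br />
The rule quoted above can be sketched as a single pass over a set of triples. This is only an illustration of the idea, not how any particular reasoner works: the predicate names ex:inState, ex:cityCode and ex:stateCode, as well as the sample data, are hypothetical.<br />

```python
def apply_state_rule(triples):
    """If a city code is associated with a state code, and an address uses
    that city code, then the address has the associated state code."""
    facts = set(triples)
    # city code -> state code, taken from the (hypothetical) ex:inState triples
    state_of_city = {city: state for city, p, state in facts if p == "ex:inState"}
    derived = {
        (address, "ex:stateCode", state_of_city[city])
        for address, p, city in facts
        if p == "ex:cityCode" and city in state_of_city
    }
    return facts | derived


# Made-up sample data: one city with a known state, one without.
triples = [
    ("ex:city/ITH", "ex:inState", "NY"),
    ("ex:address1", "ex:cityCode", "ex:city/ITH"),
    ("ex:address2", "ex:cityCode", "ex:city/UNKNOWN"),
]

result = apply_state_rule(triples)
print(("ex:address1", "ex:stateCode", "NY") in result)
```

Note that this rule is specific to one domain and has to be stated explicitly, unlike the generic RDFS terms listed earlier. That is the extra expressive power the original vision had in mind.<br />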
<br />
It seems that TimBL, too, is now in favor of first achieving the limited version of the Semantic Web through the Linked Data principles: fewer ontological axioms, less knowledge management, less semantics. Maybe this attitude is aligned with <a href="http://www.w3.org/2001/tag/doc/leastPower">the Rule of Least Power</a>:<br />
<br />
<i>Principle: Powerful languages inhibit information reuse.</i><br />
<i>... </i><br />
<i>Good Practice: Use the least powerful language suitable for expressing information, constraints or programs on the World Wide Web.</i><br />
<br />
So, using ontological axioms other than those mentioned above is probably not required for creating the Web of Data. This is why their usage is discouraged by the Linked Data book - they provide more power than needed. A more powerful use of ontologies might be labeled as the <i>Web of Knowledge</i> or <i>Linked Ontologies</i> or <i>Linked Knowledge</i> as opposed to the Web of Data and Linked Data. The original Semantic Web vision probably was to create both a Web of Data and a Web of Knowledge (Web of Ontologies). The goal of the Linked Data paradigm is to achieve the Web of Data only.