Kubernetes Ingress Services


Kubernetes Ingress Services Definition

In Kubernetes, an ingress is an API object that manages external access to services in a cluster, typically exposing HTTP and HTTPS routes from outside the cluster to services within it. Traffic routing is controlled by rules defined on the ingress resource. An ingress may be configured to give services SSL/TLS termination, traffic load balancing, externally reachable URLs, and name-based virtual hosting.

Typically an Ingress controller, usually with a load balancer, is responsible for fulfilling the ingress, though it may also configure additional frontends such as an edge router to assist in managing traffic. Ingress allows for traffic routing and access without exposing every node on the service or creating multiple load balancers.

An Ingress does not expose arbitrary protocols or ports. To expose services other than HTTP and HTTPS to the internet, a service of type NodePort or LoadBalancer is generally required.
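An ingress resource declares its routing rules in YAML. The following is a minimal sketch using the `networking.k8s.io/v1` API; the resource name, backing service name, and port are illustrative:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-service   # a hypothetical existing ClusterIP service
                port:
                  number: 80
```

An ingress controller running in the cluster watches for resources like this and configures its underlying proxy to route matching HTTP traffic to the named service.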

Image shows traffic moving through ingress controller and being dispersed amongst 2 services before pods accept connections.

Kubernetes Ingress Services FAQs

What Are Kubernetes Ingress Services?

A Kubernetes cluster is made up of node machines. These machines run containerized applications inside pods, which are grouped based on the type of service they provide. A set of pods must be able to accept connections from inside or outside the cluster, yet both internal and external access present challenges.

For external access, users must reach the application through a node’s IP address, yet a node’s IP address alone does not route traffic to the pods inside the cluster. And while each pod in the system is assigned its own unique pod IP, these addresses are unreliable for internal communication because they are not static: pods are created and destroyed, and their IPs change with them. A persistent way to reach the cluster’s workloads, both internally and externally, is therefore critical.

A service is a Kubernetes object that acts as a stable address for pods, serving as an endpoint that enables communication between components inside and outside the application. The three important Kubernetes service types are:

  • ClusterIP
  • NodePort
  • LoadBalancer


Each of these determines how a service is exposed on the network, ranging from cluster-internal access only (ClusterIP) to accepting requests from outside the Kubernetes cluster (NodePort and LoadBalancer).


ClusterIP

ClusterIP is the default service type in Kubernetes. Unless the user manually specifies another type, a service is exposed on a ClusterIP, which is reachable only from within the cluster.
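A ClusterIP service can be sketched as follows; the service name, label selector, and ports are illustrative:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: backend
spec:
  # type: ClusterIP is the default and may be omitted
  selector:
    app: backend        # routes to pods carrying this label
  ports:
    - port: 80          # port the service listens on inside the cluster
      targetPort: 8080  # port the selected pods accept traffic on
```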


NodePort

Declaring the NodePort setting in a service’s YAML causes Kubernetes to allocate a port on every node and forward any request arriving at that port to the service.
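A NodePort service sketch; the names and ports are illustrative, and the explicit `nodePort` may be omitted to let Kubernetes pick one from the allowed range:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-nodeport
spec:
  type: NodePort
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 8080
      nodePort: 30080   # must fall within 30000-32767 if pinned manually
```

With this in place, a request to any node’s IP on port 30080 is forwarded to one of the service’s pods.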

It is simple to manage services with a NodePort. Kubernetes assigns each NodePort service a TCP port from a fixed range and exposes it outside the cluster on every node. This is convenient because a client can target any cluster node via that one port and be sure the request lands.

However, this is also a less robust method. There is no way to know in advance which port a service will be allocated, and ports can be re-allocated. Furthermore, port values must fall between 30000 and 32767, a range that is both outside the well-known ports and non-standard compared to the familiar 80 and 443 for HTTP and HTTPS. The randomness itself presents a challenge, particularly when configuring firewall rules, NAT, and the like, since a different port is randomly assigned to each service.

Load Balancer

By specifying the type property in the service’s YAML, the same way the NodePort type is set, users can make a service the LoadBalancer type. A load balancer spreads workloads evenly across the pods backing a service.
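A LoadBalancer service sketch; the names and ports are illustrative, and an external load balancer must be available, typically from a cloud provider:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-lb
spec:
  type: LoadBalancer   # the cloud provider provisions an external load balancer
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 8080
```

Once provisioned, the cloud provider populates the service’s external IP, and clients reach the pods through that address.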

Kubernetes service ingress load balancers sit between servers and the internet, connecting users and exposing services to the outside world, while also providing failover. Should a server fail, the load balancer reduces the effect on users by redirecting the workload to a backup server, and sending user requests to whichever servers are available. They add servers when demand is high, and drop them when demand falls. However, many services demand high numbers of load balancers, because each service requires its own.

However, there are prerequisites for this approach, which depends on an external load balancer being available to the cluster, typically via a cloud provider. Cloud-hosted environments such as Amazon’s EKS and Google’s GKE each spin up their own hosted load balancer technology, along with a new public IP address, for every LoadBalancer service.

While users expose a service of type NodePort or LoadBalancer by specifying a value in the service’s type field, an ingress is a completely independent resource. Isolated and decoupled from the services it exposes, an ingress is created, declared, and destroyed separately from those services, with routing rules consolidated in one place.

Ingress controllers are pods, just like any other application, which means they are part of the cluster and can see the other pods in it. Ingress controllers are built on underlying reverse proxies with various performance specifications and features, such as Layer 7 routing and load balancing capabilities.

Like other Kubernetes pods inside the cluster, ingress controllers are susceptible to the same rules for exposure via a Service: they require either a LoadBalancer or NodePort for access. However, to contrast Kubernetes service load balancers vs ingress controllers, the controller can route traffic through one single service and then connect to many internal pods. The ingress controller can inspect HTTP requests and direct them to the correct pods based on the domain name, the URL path, or other observed characteristics.
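The domain- and path-based routing described above is declared on the ingress resource itself. A sketch of a simple fanout, with hypothetical host, service names, and ports:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: fanout-ingress
spec:
  rules:
    - host: shop.example.com       # name-based virtual hosting
      http:
        paths:
          - path: /cart            # URL path routing to one backend...
            pathType: Prefix
            backend:
              service:
                name: cart-service
                port:
                  number: 80
          - path: /catalog         # ...and another path to a second backend
            pathType: Prefix
            backend:
              service:
                name: catalog-service
                port:
                  number: 80
```

One controller, exposed through a single service, can thus route traffic to many internal pods.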

Ingress service in Kubernetes takes a declarative approach that allows users to specify what they want but leaves fulfillment to the Ingress Controller, which configures its underlying proxy to enact any new ingress rules and corresponding routes it detects.

The need to configure an ingress controller for a Kubernetes cluster is a disadvantage to this approach. However, this is simple with the VMware NSX Advanced Load Balancer Kubernetes ingress controller, the Nginx ingress controller, or another Kubernetes ingress services solution. Many of the options on the market are open-source, and many are compatible with each other. In other words, there is no single Azure Kubernetes service ingress controller, or Google GKE Kubernetes ingress controller, but instead a range of potential compatible options.

Kubernetes Ingress vs Service

There is a difference between ingress and service in Kubernetes. A Kubernetes service is a logical abstraction for a group of pods which perform the same function deployed in a cluster. A service enables a group of pods, which are ephemeral, to provide specific functions such as image processing or web services, with an assigned name and ClusterIP, a unique IP address.

ClusterIP, Kubernetes’ default service type, uses a cluster-internal IP and therefore limits access to the service to within the cluster: it allows pods to communicate with each other only inside the cluster.

Users can also set NodePort as the service type, another Kubernetes networking option. NodePort lets users set up their own load balancing solutions and configure environments that Kubernetes does not fully support. It opens a way for external traffic to reach the nodes by exposing a port on each node’s IP address.

This brings us to the difference between a service and an ingress in Kubernetes. An ingress is an API object that manages outside access to Kubernetes cluster services, generally via HTTP/HTTPS, by providing routing rules. With ingress routes, users can consolidate routing rules into a single resource and define how traffic moves without creating multiple load balancers or exposing each service on the node.

Comparing the NodePort or LoadBalancer Kubernetes services vs ingress, ingress is an entry point that rests in front of multiple services in the cluster, not a type of service. NodePorts are simple to use and convenient, especially at the dev/test stage, but compared to Kubernetes ingress services they have many weaknesses. Here are some of the other Kubernetes ingress and service differences:

  • To reach the service, clients must know node IP addresses; ingress makes hiding cluster internals such as numbers and IP addresses of nodes and using a DNS entry simpler
  • NodePort increases the complexity of secure Kubernetes cluster management because it demands opening external port access to each node in the cluster for every service
  • Using NodePort with a proliferation of clusters and services, reasoning about network access and troubleshooting rapidly become very complex


Finally, consider the difference between a Kubernetes deployment vs service vs ingress. A Kubernetes deployment refers to the file that defines the desired characteristics or behavior of the pod itself. For example, a Kubernetes deployment might instruct Kubernetes to automatically upgrade the pod’s container image version so there’s no need to do it manually. A Kubernetes deployment is a management tool for pods.

Does VMware NSX Advanced Load Balancer Offer Kubernetes Ingress Service Discovery?

Yes. VMware NSX Advanced Load Balancer’s advanced platform serves as a Kubernetes ingress controller with advanced application services. Appliance-based load balancing solutions are obsolete in the face of modern, microservices-based application architectures. Containerized applications deployed in Kubernetes clusters need enterprise-class, scalable Kubernetes Ingress Services for global and local traffic management, load balancing, monitoring/analytics, service discovery, and security.

VMware NSX Advanced Load Balancer’s advanced Kubernetes ingress controller delivers enterprise-grade features, multi-cloud application services, high levels of machine learning-based automation, and observability—all designed to help bring container-based applications into enterprise production environments. Learn more here.

For more on the actual implementation of load balancers, check out our Application Delivery How-To Videos.

Kubernetes Service Discovery


Kubernetes Service Discovery Definition

Kubernetes is a platform for container orchestration that consists of a group of well-documented, API-oriented fast binaries that are simple foundations for building applications. A Pod is a basic Kubernetes building block. This Kubernetes resource represents a collection of containers that users can create and destroy as needed.

Any internal IP addresses assigned to a Pod can change over time because the Kubernetes cluster scheduler can move or reschedule Pods to other Kubernetes cluster nodes. This presents a new problem: when a Pod moves to a new node, the connection based on its internal IP address stops working, and so it can no longer be used to access the application.

A new layer of abstraction called a Service allows Pods to remain accessible from elsewhere in the cluster or from external networks without relying on internal IPs. Kubernetes service meshes further reduce the challenges presented by service and container sprawl in a microservices environment by automating and standardizing communication between services.

Services allow the Pods that work together in a cluster to connect across the network. Service discovery is the process of connecting pods and services.

There are two options for discovering services internal to Kubernetes:

  • DNS discovery: Kubernetes provides a DNS server (CoreDNS, or the older kube-dns) as a cluster add-on, and every service registers with the DNS server so services can find and communicate with one another.
  • Environment variables: when Kubernetes creates a new pod, it injects environment variables describing existing services, enabling pod-to-service communication.


There are two options for Kubernetes external service discovery:

  • Load balancer discovery: Kubernetes and the cloud provider together serve as load balancer, redirecting pod traffic.
  • NodePort discovery: Kubernetes uses special ports of node IP addresses to expose NodePort services.

This image depicts Kubernetes service discovery by showing an external network connecting to Kubernetes service to Pods.

Kubernetes Service Discovery FAQs

How Does Service Discovery Work in Kubernetes?

Kubernetes service discovery is an abstraction that allows an application running on a set of Pods to be exposed as a network service. This enables a set of Pods to run using a single DNS name, and allows Kubernetes load balancing across them all. This way as Pods are dynamically created and destroyed on demand, frontends and backends can continue to function and connect.

The Kubernetes service is an abstraction that defines both a set of Pods based on logic and a policy for accessing the Pods, or a micro-service. A selector typically determines the set of Pods a Service targets.

As a Kubernetes service discovery example, consider a stateless backend for data-processing that is running with multiple fungible replicas. The specific Pods on the backend may change, but this isn’t important to the clients on the frontend, who neither need to track nor be aware of the backends themselves. Kubernetes service discovery is the abstraction that makes this separation possible.
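The data-processing example above can be sketched as a Deployment of fungible replicas fronted by a Service; the names, image, and ports are hypothetical:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: data-processor
spec:
  replicas: 3                  # fungible backend replicas
  selector:
    matchLabels:
      app: data-processor
  template:
    metadata:
      labels:
        app: data-processor
    spec:
      containers:
        - name: worker
          image: example/data-processor:1.0   # hypothetical image
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: data-processor
spec:
  selector:
    app: data-processor   # matches whichever replicas currently exist
  ports:
    - port: 80
      targetPort: 8080
```

Frontend clients address `data-processor` by name; the Service tracks the changing set of pods behind it.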

How Does Kubernetes Expose Services?

How does Kubernetes do service discovery? There are three schemes in Kubernetes for exposing services:

  • ClusterIP. ClusterIP is the virtual IP address meant for pod communication within the cluster. ClusterIP services can expose their pods to the network. For example, Kubernetes exposes a database pod through a ClusterIP-based service to make it available to the web server pods.
  • NodePort. Typically used for services with external consumers, NodePort is used to expose a service on the same port across all cluster nodes. An internal routing mechanism ensures that requests made via NodePort are forwarded automatically to the appropriate predetermined destination pods on each node.
  • LoadBalancer. The load balancer component extends the NodePort service by adding Layer 4 (L4) and Layer 7 (L7) load balancers to a service, routing requests to the instances that can best handle them. Load balancers ensure that some containers don’t sit idle while others become overwhelmed with traffic. Clusters running in public cloud environments that support automated provisioning of software-defined load balancers often use this scheme.


When multiple services must share the same external endpoint or load balancer, an ingress controller may be indicated. An ingress controller provides Secure Sockets Layer (SSL) termination, load balancing, and name-based virtual hosting to manage external access to the services in a cluster.

There are three main Kubernetes service discovery methods: server-side discovery, client-side discovery, and DNS discovery.

Server-Side Service Discovery

Instance IPs can change without warning, making direct communication between services unpredictable. An intermediary such as a load balancer may be more reliable and promote better service discovery.

The load balancer or reverse proxy sits in front of a group of instances and constitutes a single service. From the perspective of the client, because the service discovery happens completely on the server-side, accessing the multi-instance service is just like accessing a single network endpoint.

The client makes an initial network request, triggering the server-side service discovery process. Kubernetes routes the request to a load balancer. However, without accurate information the load balancer could make poor routing decisions; this is why the load balancer relies on the service registry to track instances, communicate with them, and relay their statuses.

Particularly in highly-loaded environments, the server-side service discovery has a few drawbacks. The load balancer component is a potential throughput bottleneck and a single point of failure, so a reasonable level of redundancy must be baked into the load balancing layer.

Client-side Service Discovery

By retaining the service registry and removing the load balancer from the equation in an attempt to improve service discovery, we arrive at client-side discovery. In practice, the most famous real-world example of client-side service discovery is the Netflix Eureka project.

These methods eliminate the load balancer as a single point of failure and thus reduce occasions for bottlenecking. The client:

  • Retains the service registry
  • Directly looks up the available service instance addresses in the service registry
  • Fetches the service fleet, a complete list of IP addresses
  • Determines which instances are viable
  • Selects an optimal instance based on available load balancing strategies
  • Sends a request to the preferred instance and awaits a response


The benefits of the client-side approach arise from the removal of the load balancer, which means less chance of a throughput bottleneck, no single point of failure, and less equipment to manage.

However, as with the server-side service discovery, client-side service discovery has some significant drawbacks. Client-side service discovery complicates the clients with extra logic, requiring integration code for every framework or programming language in the ecosystem and coupling clients with the service registry.

DNS Discovery

Finally, there is the concept of DNS service discovery. In this process, the client uses a DNS PTR record to obtain the list of service instances and then resolves a domain name to a working instance.

However, some question whether this is complete service discovery. In practice, DNS plays a role closer to the service registry, feeding either a client-side or a server-side solution. There are several issues with DNS discovery on the whole, including slow propagation of DNS record updates, which leaves clients waiting even longer on stale service ports and instance statuses.

DNS is generally ill-suited for service discovery, including inside the Kubernetes ecosystem. This is why Kubernetes introduces one more IP address for every service rather than using round-robin DNS to list the IP addresses of Pods. This address is called the clusterIP (not to be confused with the ClusterIP service type), a virtual IP.

Does VMware NSX Advanced Load Balancer Offer a Kubernetes Service Discovery Tool?

Appliance-based load balancing solutions are obsolete in the face of modern, microservices-based application architectures. Kubernetes clusters deploying containerized applications need enterprise-class, scalable Kubernetes Ingress Services for monitoring/analytics, service discovery, load balancing, local and global traffic management, and security.

VMware NSX Advanced Load Balancer’s advanced multi-cloud application services Kubernetes ingress controller offers enterprise-grade features, high levels of automation derived from machine learning, and enough observability to usher container-based applications into enterprise production environments.

Based on scalable, software-defined architecture, Vantage Kubernetes container services go far beyond what typical Kubernetes service controllers deliver. Expect a rich set of rollout and application maintenance tools as well as observability, security, and traffic management. VMware NSX Advanced Load Balancer’s centrally orchestrated, elastic proxy services fabric provides dynamic load balancing, analytics, security, micro-segmentation, and service discovery for containerized applications running in Kubernetes environments.

Find out more about VMware NSX Advanced Load Balancer’s Kubernetes service discovery here.

For more on the actual implementation of load balancing, security applications and web application firewalls check out our Application Delivery How-To Videos.

Kubernetes Container


Kubernetes Container Definition

Kubernetes is an open-source, extensible, portable container management platform. Kubernetes has a sizable ecosystem and is designed to facilitate both declarative configuration and automation in managing containerized workloads and services.

What is a container in Kubernetes? Kubernetes containers resemble virtual machines (VMs), each with its own CPU share, filesystem, process space, memory, and more. However, Kubernetes containers are considered lightweight because:

  • they can share the Operating System (OS) among applications due to their relaxed isolation properties
  • they are decoupled from the underlying infrastructure
  • they are portable across OS distributions and clouds


Each running Kubernetes container is repeatable. Users can expect the same behavior from a Kubernetes container regardless of its environment because included dependencies standardize performance.

Decoupling applications from underlying host infrastructure makes it simpler to deploy in various OS or cloud environments.


This image depicts a kubernetes container deploying multiple applications.

Kubernetes Containers FAQs

What Are Kubernetes Containers?

What are containers in Kubernetes? Before containers, users typically deployed one application per virtual machine (VM), because deploying multiple applications could trigger strange results when shared dependencies were changed on one VM. Essentially, Kubernetes containers virtualize the host operating system and isolate the dependencies of an application from other running containers in the same environment.

Running a single application for each VM resolves this issue, but wastes CPU and memory resources that should be available to the application. Kubernetes containers instead use a container engine to run applications that can all use the same operating system in containers isolated from other applications on the host VM. Kubernetes containers also use a container image, a ready-to-run software package of an application and its dependencies that contains everything needed to run an application: code, required runtime, system and application libraries, and essential setting default values. This reduces costs and allows for higher resource utilization.

Benefits of Kubernetes Containers

Kubernetes containers offer a number of benefits, including:

  • Agile creation and deployment of applications and container images compared to VMs
  • Image immutability allows for more frequent, reliable container image build and efficient, speedy rollbacks during deployment
  • Development and Ops concerns decoupled with infrastructure and applications as container images are created at build/release time rather than deployment time
  • Enhanced observability of OS-level metrics and application health
  • Environmental consistency across machines and clouds through development, testing, and production
  • Portable distribution on major public clouds, on-premises, on CoreOS, RHEL, Ubuntu, and elsewhere
  • Runs application using logical resources on an OS for application-centric focus
  • Distributed, dynamic microservices application environment contrasts with larger single-purpose machine running a monolithic stack
  • Resource isolation results in predictable performance
  • High resource utilization and density


Kubernetes containers support an extremely diverse variety of workloads, including stateful and stateless applications and data-processing workloads. Kubernetes containers can run any container applications.

Furthermore, Kubernetes eliminates the need for orchestration or centralized control. Kubernetes includes multiple, independent control processes that drive the system towards the desired state continuously regardless of the specific order of steps. This produces a more dynamic, extensible, powerful, resilient, and robust system that is more user-friendly.

What Are Containers and Kubernetes and How Do They Work?

Kubernetes comprises several components deployed as a cluster that interact with one another. The cluster is the basic unit of Kubernetes architecture, a sort of motherboard or central nervous system that orchestrates applications and runs pods as defined by users.

A Kubernetes container runs inside a pod via a Kubernetes container runtime. Pods, whether related or not, run in groups on the cluster, and nodes, the physical or virtual machines that sit between the pod and the cluster, host the pods.

Each Kubernetes container cluster is made up of at least one worker node, worker machines that run containerized applications. The control plane manages the Kubernetes pods and worker nodes in the cluster.

Control plane components include:

kube-apiserver: the API server is the front end for the Kubernetes control plane and exposes the Kubernetes API. The kube-apiserver is designed to scale by deploying more instances, horizontally—traffic balances between the instances.

etcd: etcd is a highly available, consistent key-value store for all cluster data.

kube-scheduler: kube-scheduler assigns newly created pods to nodes based on affinity and anti-affinity specifications, deadlines, data locality, hardware/software/policy constraints, individual and collective resource requirements, and inter-workload interference.
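As a sketch of the affinity constraints kube-scheduler honors, a pod can require nodes matching a label; the pod name, label key, and image here are hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: fast-storage-workload
spec:
  affinity:
    nodeAffinity:
      # hard constraint: only schedule onto nodes labeled disktype=ssd
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: disktype
                operator: In
                values: ["ssd"]
  containers:
    - name: app
      image: example/app:1.0   # hypothetical image
```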

kube-controller-manager: the kube-controller-manager runs the controller processes; although each controller is logically separate, they are compiled into a single binary and run in one process to reduce complexity.

cloud-controller-manager: cloud-controller-manager links cloud APIs and clusters with cloud-specific control logic to enable interactions.

kubectl: kubectl allows users to run Kubernetes container commands against clusters. It is installable on Windows, macOS, and a variety of Linux platforms. kubectl can help users inspect and manage cluster resources, deploy applications, and view logs.

kubelet: kubelet runs on all cluster nodes to ensure containers in a pod are running and healthy.

kube-proxy: kube-proxy runs on each cluster node, maintaining network communication rules and implementing the Kubernetes Service concept.

Kubernetes container runtime: the container runtime is a software implementation of the Kubernetes CRI (Container Runtime Interface) that runs containers. Kubernetes supports many container runtimes, including containerd, Docker Engine, CRI-O, and Mirantis Container Runtime. Docker has been the most frequently used Kubernetes container runtime, which is why discussions of Kubernetes container management often include general Docker terminology.

Docker Container vs Kubernetes

It may be tempting to make a Kubernetes container vs Docker container comparison, because both present comprehensive container management solutions for applications with impressive capabilities. However, they have different origins, solve different problems, and are therefore not precisely comparable.

Here are a few differences:

  • Kubernetes is designed to run across a cluster, in contrast to Docker, which runs on a single node
  • Kubernetes needs a container runtime to orchestrate, while Docker can be used without Kubernetes
  • Kubernetes is designed to include custom plugins that build out into custom solutions, while it is simple to run a Docker build on a Kubernetes cluster
  • Both Kubernetes and Docker Swarm are orchestration technologies, but Kubernetes is agnostic about ecosystems while Docker Swarm is closely integrated with the Docker ecosystem
  • Kubernetes has become the container management and orchestration de facto standard, while Docker has become better-known for container development and deployment


Does VMware NSX Advanced Load Balancer Offer Kubernetes Container Monitoring?

Yes. Vantage delivers multi-cloud application services such as load balancing for containerized applications with microservices architecture through dynamic service discovery, application traffic management, and web application security. Container Ingress provides scalable and enterprise-class Kubernetes ingress traffic management, including local and global server load balancing (GSLB), web application firewall (WAF) and performance monitoring, across multi-cluster, multi-region, and multi-cloud environments. The VMware NSX Advanced Load Balancer integrates seamlessly with Kubernetes for container and microservices orchestration and security.

Learn more about the universality, security, and observability of VMware NSX Advanced Load Balancer’s Kubernetes container monitoring solution.

For more on the actual implementation of load balancing, security applications and web application firewalls check out our Application Delivery How-To Videos.

Kubernetes Monitoring


Kubernetes Monitoring Definition

Kubernetes monitoring is a type of reporting that helps identify problems in a Kubernetes cluster and implement proactive cluster management strategies. Kubernetes cluster monitoring tracks cluster resource utilization, including storage, CPU and memory. This eases containerized infrastructure management. Many organizations go beyond this inherent monitoring functionality to gain full visibility over cluster activity with full suites of cloud-native monitoring tools.

This image depicts kubernetes monitoring software tracking cluster resource utilization, including storage, CPU and memory.


What is Kubernetes Monitoring?

Kubernetes requires a monitoring approach distinct from those used for traditional, long-lived hosts such as physical machines and VMs. A Kubernetes-based architecture’s inherent abstraction offers a framework for comprehensive application monitoring in a dynamic container environment. By tailoring a monitoring approach to complement the built-in abstractions of a Kubernetes system, comprehensive insights into application performance and health are possible, despite the constant motion of the containers running the applications.

Kubernetes container monitoring differs from traditional monitoring of more static resources in several ways.

Kubernetes identifies which services and pods belong together using labels. A container environment tracks a much larger number of objects with much shorter lifespans than a traditional one; given the scalability and automation inherent to a Kubernetes system, labels are the only reliable way to identify and track applications and the pods they run in.

Aggregate data from containers and pods by label to get continuous visibility into Kubernetes objects such as services. Labeling pods connects events and metrics to the various layers of the Kubernetes architecture and keeps observability data actionable.
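Labels are plain key-value pairs attached in a pod’s metadata; monitoring tools can then aggregate by any of them. A sketch, with hypothetical names and label values:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: checkout-6f9d           # individual pod names churn constantly...
  labels:
    app: checkout               # ...but labels persist across restarts,
    tier: frontend              # so metrics can be grouped by application,
    environment: production     # layer, or environment rather than by pod
spec:
  containers:
    - name: checkout
      image: example/checkout:2.3   # hypothetical image
```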

Another difference between traditional, host-centric infrastructure and Kubernetes architecture is the additional layers of abstraction that are part of K8s systems. More abstraction means additional components to monitor.

Older systems presented two main layers to monitor: hosts and applications. Containers added a layer of abstraction between applications and hosts. Kubernetes, yet another layer of comprehensive infrastructure, orchestrates containers and also requires monitoring.

Thus, Kubernetes application monitoring involves four distinct components, each with its own challenges:

  • Hosts, regardless of which applications/containers they are running
  • Containers, wherever they are running
  • Containerized applications
  • The entire Kubernetes cluster

Additionally, since applications are always in motion and highly distributed, monitoring the health of your Kubernetes infrastructure requires collecting metrics and events from all your containers and pods, including the applications actually running in them.

Kubernetes schedules workloads automatically and things move rapidly, so users typically have very little control over where applications are running. (Users can assign node affinity or anti-affinity to particular Kubernetes pods, but to benefit most from its automatic resource management and scheduling, most users delegate that control to Kubernetes.)

It’s not possible to manually configure checks to collect monitoring data from applications upon each start or restart given the rate of change in a typical Kubernetes cluster. Kubernetes monitoring tools with service discovery enable users to maximize the inherent automation and scalability of Kubernetes without sacrificing visibility.

Even as containerized workloads contract, expand, or shift across hosts, service discovery enables continuous monitoring. Service discovery in Kubernetes detects any change in the inventory of running pods and automatically re-configures data collection to match.

Kubernetes Metrics Monitoring

Find important Kubernetes monitoring metrics using the Kubernetes Metrics Server. Kubernetes Metrics Server collects and aggregates data from the kubelet on each node. Consider some of these key Kubernetes metrics:

  • API request latency, the lower the better, measured in milliseconds
  • Cluster state metrics, including the availability and health of pods
  • CPU utilization in relation to per pod CPU resource allocation
  • Disk utilization, including lack of space for file systems and inodes
  • Memory utilization at the node and pod levels
  • Node status, including disk or processor overload, memory, network availability, and readiness
  • Pod availability (unavailable pods may indicate poorly designed readiness probes or configuration issues)

The Need: How to Monitor Kubernetes

At the enterprise level, containers have experienced explosive growth. Kubernetes in business offers many benefits to DevSecOps, developers, and IT teams. However, deploying containerized applications with Kubernetes delivers scalability and flexibility that are themselves a challenge.

Servers and applications are no longer correlated at a 1-to-1 ratio. Applications are abstracted more than once, by containers and by Kubernetes, so tracking application health without the proper tools is impossible. Here are some Kubernetes monitoring best practices to keep in mind.

Monitoring Kubernetes cluster nodes. Acquire a broad view of overall platform capacity and health by monitoring the Kubernetes cluster. Monitor cluster resource and infrastructure usage to determine whether the cluster is underutilized or over capacity. Node health and availability are important to monitor to reveal whether there are sufficient resources and nodes available to replicate applications. Finally, monitor chargeback or resource usage for each project and/or team.

Monitoring Kubernetes deployments and pods. Monitoring Kubernetes constructs such as deployments, namespaces, DaemonSets, or ReplicaSets ensures proper application deployment. Monitor failed and missing pods to determine how many pods fail and whether the pods are running for each application. Watch pod resource usage vs limits and requests to confirm that memory and CPU limits and requests are set and compare those with actual usage. And monitor running vs desired instances, specifically, how many instances for each service do you expect to be ready, and how many are actually ready?
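As a sketch of what setting limits and requests looks like in practice, a Deployment might declare per-container resources like this (the names and image are hypothetical):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                # hypothetical application name
spec:
  replicas: 3              # desired instances, to compare against ready instances
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: example.com/web:1.0   # placeholder image
        resources:
          requests:        # what the scheduler reserves for the pod
            cpu: 250m
            memory: 128Mi
          limits:          # hard ceiling; exceeding the memory limit gets the container killed
            cpu: 500m
            memory: 256Mi
```

Comparing actual usage against these declared values is what reveals whether a workload is over- or under-provisioned.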

Monitoring Kubernetes applications. This is more familiar monitoring for application availability and confirming that the application is responding. It also includes monitoring application health and performance. How many requests are there? Are there errors? Measure latency and responsiveness as well.

Monitoring Kubernetes containers. Monitoring tools rely on services as their endpoint because pods and their containers are dynamically scheduled and in constant motion. Even as individual pods and containers are created and deleted, services can communicate continually because each service exposes a stable IP address that outlives the pods behind it.

Monitoring Kubernetes pod health. Three probes touch upon Kubernetes pod health: the liveness, readiness, and startup probes. They are run and managed by the kubelet.

The liveness probe helps identify when pods have become unresponsive and determine if a container within a pod needs to restart.

Only when all of the containers in a pod are ready is the pod itself ready. Pods that are not ready will not receive incoming traffic and will be removed from service load balancers. The readiness probes tell the cluster when pod containers are ready to start processing traffic.

The startup probe indicates if and when the application in the pod has successfully started. When a startup probe is configured, the liveness and readiness probes are disabled until it succeeds, so that a slow-starting application is not killed or marked unready before it finishes starting.
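The three probes can be sketched in a single pod spec as follows (the pod name, image, and HTTP endpoints are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo              # hypothetical name
spec:
  containers:
  - name: app
    image: example.com/app:1.0  # placeholder image
    startupProbe:               # liveness/readiness are held off until this succeeds
      httpGet:
        path: /healthz
        port: 8080
      failureThreshold: 30      # allow up to 30 x 10s = 300s for a slow start
      periodSeconds: 10
    livenessProbe:              # kubelet restarts the container if this fails
      httpGet:
        path: /healthz
        port: 8080
      periodSeconds: 10
    readinessProbe:             # pod is removed from service endpoints if this fails
      httpGet:
        path: /ready
        port: 8080
      periodSeconds: 5
```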

Kubernetes Monitoring Tools

In other words, what is Kubernetes health and how is it monitored?

How to monitor Kubernetes nodes. The health of Kubernetes nodes directly affects their ability to run their assigned pods. The node-problem-detector DaemonSet aggregates problems from node daemons and reports them to the API server as node events and conditions.

Kubernetes users often use open source tools that are deployed inside Kubernetes as monitoring solutions. These include Prometheus/Grafana and the older Heapster/InfluxDB/Grafana stack (Heapster is now deprecated). It’s also possible to conduct Kubernetes monitoring with the ELK Stack or a hosted solution. Finally, proprietary APM solutions that offer Kubernetes monitoring are also on the market. Depending on your organization’s needs, an open source Kubernetes monitoring solution might be best, or a proprietary or hosted solution might have its benefits.

Here are some of the more common tools for Kubernetes monitoring.

Prometheus metrics. Prometheus is an open source monitoring system originally developed at SoundCloud and now hosted by the Cloud Native Computing Foundation (CNCF). After exporter pods are installed on each node in the cluster, the Prometheus server collects data on nodes, pods, jobs, and other Kubernetes health metrics. It saves the collected time-series data into a database and generates alerts automatically based on preset conditions.

The Prometheus dashboard is limited, but users enhance it with external visualization tools such as Grafana, which enables customized and sophisticated debugging, inquiries, and reporting using the Prometheus database. Prometheus supports importing data from many third-party databases.
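As one illustration of service-discovery-driven collection, a fragment of a Prometheus configuration can use Kubernetes service discovery to scrape only pods that opt in via an annotation (the annotation is a common convention, not a Prometheus default):

```yaml
# Hypothetical fragment of prometheus.yml: discover all pods in the cluster
# and keep only those annotated with prometheus.io/scrape: "true".
scrape_configs:
- job_name: kubernetes-pods
  kubernetes_sd_configs:
  - role: pod                 # service discovery over the live pod inventory
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: "true"
```

Because the target list is rebuilt from the API server, scrape targets follow pods automatically as they are created, deleted, or rescheduled.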

Kubernetes dashboard. The Kubernetes dashboard is a simple web interface for debugging containerized applications and managing cluster resources. The Kubernetes dashboard provides a rundown of all defined storage classes and all cluster namespaces, and a simple overview of resources, both on individual nodes and cluster-wide.

Kubernetes dashboard Admin view lists all the nodes and aggregated metrics for each along with persistent storage volumes. Config and storage view identifies persistent volume claims for all the Kubernetes resources running in the cluster and each clustered application. Workload view lists every running application by namespace, including the number of pods currently ready in a Deployment and current pod memory usage. And Discover view lists exposed services that have enabled discovery inside the cluster.

cAdvisor. cAdvisor, which is built into the kubelet, collects resource usage, resource isolation, and historical usage metrics at every level from the cluster down to the individual container.

Kubernetes applications always run in pods, so their health can be measured by the readiness and liveness probes in the pods. If applications are running on nodes that are not reporting any errors, and the applications themselves report they are ready to process new requests, the applications are probably healthy.

Why are Kubernetes Monitoring Tools Important?

Legacy monitoring tools fail at monitoring Kubernetes for several reasons. They are designed for monitoring known servers that didn’t change rapidly, and they focus on collecting metrics from static targets.

Kubernetes inherently increases the complexity of infrastructure. Any platform such as Kubernetes that sits between applications and the services and infrastructure that power them demands monitoring of its own.

Along with increasingly complex infrastructure, modern microservices applications massively increase the number of components communicating with each other. Containers migrate across infrastructure as needed, and each service can be distributed across multiple instances. Thus, to understand whether Kubernetes is working, it is essential to monitor the Kubernetes orchestration state and verify that all instances of the service are running.

The explosion in cloud-native architectures means a correlating explosion in scale requirements. Kubernetes monitoring tooling and methodology must retain enough granularity to inspect individual components while alerting users of high-level service objectives.

Traditional monitoring cannot manage the number of metrics generated by cloud-native architectures. In the past we knew where each service component ran and how many instances of it there were, but Kubernetes adds multidimensionality, so the various perspectives or aggregations that must be managed can quickly spiral out of control.

Containers are transient; in fact over half last just a few minutes. This high level of churn means thousands of data points, and hundreds of thousands of time series, even in a small Kubernetes cluster. The best Kubernetes monitoring solutions must be capable of scaling to hundreds of thousands of metrics.

Finally, it’s difficult to see inside containers, and they are ephemeral. This makes them naturally tough to troubleshoot, black boxes by design, in some sense. Monitoring tools for Kubernetes that offer granular visibility allow for more rapid troubleshooting.

Does VMware NSX Advanced Load Balancer Offer Kubernetes Monitoring Services?

Yes. VMware NSX Advanced Load Balancer provides a centrally orchestrated, elastic proxy services fabric with dynamic load balancing, ingress controller, service discovery, application security, and analytics for containerized applications running in Kubernetes environments.

Kubernetes monitoring demands a cloud-native approach which VMware NSX Advanced Load Balancer provides. The VMware NSX Advanced Load Balancer delivers scalable, enterprise-class container ingress to deploy and manage container-based applications in production environments accessing Kubernetes clusters. VMware NSX Advanced Load Balancer provides a container services fabric with a centralized control plane and distributed proxies:

  • Controller: A central control, management, and analytics plane that communicates with the Kubernetes primary node. The Controller includes two sub-components, the Kubernetes Operator (AKO) and the Multi-Cloud Kubernetes Operator (AMKO), which orchestrate all interactions with the kube-controller-manager. AKO handles ingress services in each Kubernetes cluster, and AMKO is used in the context of multiple clusters, sites, or clouds. The Controller deploys and manages the lifecycle of the data plane proxies, configures services, and aggregates telemetry analytics from the Service Engines.
  • Service Engine: A service proxy providing ingress services such as load balancing, WAF, GSLB, IPAM/DNS in the dataplane and reporting real-time telemetry analytics to the Controller.

For more on the actual implementation of load balancing, security applications and web application firewalls check out our Application Delivery How-To Videos.

Find out more about how VMware NSX Advanced Load Balancer’s cloud-native approach for traffic management and application networking services can assist your organization’s Kubernetes monitoring here.

Kubernetes Service Mesh

<< Back to Technical Glossary

Kubernetes Service Mesh Definition

Cloud native applications frequently run in containers as part of a distributed microservices architecture. Kubernetes deployments have become the de facto standard for orchestrating these containerized applications.

Microservices sprawl, a kind of exponential growth in microservices, is one unintended outcome of using microservices architecture. This kind of growth presents challenges within a Kubernetes cluster surrounding authentication and authorization, routing between multiple versions and services, encryption, and load balancing.

A service mesh is a mesh of Layer 7 proxies, not a mesh of services. Microservices can use a service mesh to abstract the network away, resolving many of the challenges arising from talking to remote endpoints within a Kubernetes cluster. Building on Kubernetes allows the service mesh to abstract away how inter-process and service to service communications are handled, as containers abstract away the operating system from the application.

This image depicts a kubernetes service mesh traffic overview of the control plane feeding into the frontend, backend, and database.

Kubernetes Service Mesh FAQs

What is Kubernetes Service Mesh?

A Kubernetes service mesh is a tool that adds security, observability, and reliability features to applications at the platform layer rather than the application layer.

Service mesh technology predates Kubernetes. However, growing interest in service mesh solutions is directly related to the proliferation of Kubernetes-based microservices and a resulting interest in Kubernetes service mesh options.

Microservices architectures are heavily network reliant. Service mesh manages network traffic between services.

There are other ways to manage this network traffic, but they are less sustainable than a service mesh because they impose a greater operational burden in the form of error-prone, manual labor on devops teams. Service mesh on Kubernetes achieves the same goal in a much more scalable manner.

The service mesh in Kubernetes is typically implemented as a set of network proxies. Deployed alongside application code as “sidecars,” these proxies serve as the insertion point for service mesh features and manage communication between the microservices. Together the proxies make up the data plane of the Kubernetes service mesh, which is configured and coordinated by the control plane.

Kubernetes and service mesh architectures arose as cloud native applications flourished. Hundreds of services may comprise any given application, and there may be thousands of instances of each service. Each of those instances demands dynamic scheduling as conditions change rapidly, which is where Kubernetes comes in.

Clearly, this is a highly complex system of service to service communications, but it’s also a basic, normal part of runtime behavior for a standard application. To ensure the app is reliable, secure, and performs well end-to-end, insightful management is essential.

How Does Kubernetes Service Mesh Work?

Distributed applications in any architectural environment, including the cloud, have always required rules to control how their requests get from place to place. A Kubernetes service mesh or any type of service mesh does not introduce new logic or functionality to the runtime environment. Instead, Kubernetes network service mesh abstracts the logic that controls service to service communications to a layer of infrastructure and out of individual services.

Service mesh layers atop Kubernetes infrastructure to render inter-service communications over the network reliable and safe. Kubernetes service mesh works similarly to how a tracking and routing service for shipped mail and packages does. It tracks routing rules and directs traffic and package routes dynamically based on those rules to ensure receipt and accelerate delivery.

The components of a service mesh include a data plane and a control plane. Lightweight proxies distributed as sidecars comprise the data plane. Users can deploy these proxy technologies to build a service mesh in Kubernetes, where a proxy runs as a sidecar container in every application pod.

The control plane configures the proxies, contains the policy managers, and acts as the certificate authority that issues TLS certificates. Depending on the service mesh implementation, it may also perform tracing and collect other data such as telemetry metrics.

In this way, Kubernetes service mesh enables users to separate an application’s business logic from policies controlling observability and security, allowing them to connect microservices and then secure and monitor them moving forward.

Service mesh in Kubernetes enables services to detect each other and communicate. It also uses intelligent routing to control API calls and the flow of traffic between endpoints and services. This further enables canaries or rolling upgrades, blue/green, and other advanced deployment strategies.

Service mesh on Kubernetes architectures also enables secure communication between services. For example, with Kubernetes network service mesh users can enforce communication policies that deny or allow specific types of communication—such as enforcing a policy that denies production services access to development environment client services.

Various Kubernetes service mesh options enable users to observe and monitor even highly distributed microservices systems. Kubernetes service mesh also frequently integrates with other tracing and monitoring tools to enable improved discovery and visualization of API latencies, traffic flow, dependencies between services, and tracing.

This level of functionality is essential to monitoring complex cloud native applications and the distributed microservices environments that comprise them. Observability and granular insights are critical for a higher level of operational control.

What is Istio Kubernetes Service Mesh?

Istio is an open source service mesh that has become the service mesh of choice for many major tech businesses such as Google, IBM, and Lyft. Istio shares the data plane and control plane structure that all service meshes feature; its data plane is made up of Envoy proxies. These proxies are deployed as sidecar containers within each Kubernetes pod, establishing connections to other services and moderating communications with them.

The rules for managing this communication are configured through the control plane and enforced in the data plane. The data plane of the Istio service mesh is responsible for traffic management, protocol-specific fault injection, and several types of Layer 7 load balancing. This application layer load balancing stands in contrast to standard Kubernetes load balancing, which operates only at Layer 4, the transport layer.
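As an illustrative sketch of Layer 7 traffic shifting in Istio, a VirtualService can split traffic between two versions of a service by weight (the service and subset names here are hypothetical, and the subsets would be defined in a companion DestinationRule):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews               # hypothetical service name
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1            # stable version
      weight: 90
    - destination:
        host: reviews
        subset: v2            # canary version receiving 10% of traffic
      weight: 10
```

Adjusting the weights over time is how canary and blue/green rollouts are driven from the mesh rather than from application code.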

Other components collect metrics on traffic and respond to various data plane queries such as access control, authentication and authorization, or quota checks. They can also interface with monitoring and logging systems, depending on which adapters are enabled, and provide encryption and authentication policies and enforcement. For example, Istio supports TLS authentication and role-based access control.

Many other tools integrate with Istio to expand its capabilities.

Benefits of Service Mesh in Kubernetes

Microservices architecture has been a key step in the move towards cloud native architecture. While it provides flexibility, microservices architecture is also inherently complex. Container services can manage and deploy microservices architectures, but as they grow and sprawl, insight becomes more limited. This presents the main limitation for cloud native architecture, which demands deep insight for traffic management, security, and other critical functions.

A service mesh aids in resolving some of this complexity by providing the ability to use the services of multiple stack layers in a single infrastructure layer—all without requiring integration or code modification by your application developers. This makes communication between services faster and more reliable. Service mesh in Kubernetes also offers observability in the form of logging, tracing, and monitoring; granular traffic management; security in the form of encryption and authentication and authorization; and failure recovery.

In practice, using a Kubernetes service mesh makes it easier to implement security and encryption between services and reduces the burden on devops teams. A service mesh also makes tracing a service latency issue simpler. And although different service meshes provide different features, common capabilities include:

  • API (Kubernetes Custom Resource Definitions (CRD), programmable interface)
  • Communication resiliency (circuit-breaking, retries, rate limiting, timeouts)
  • Load balancing (consistent hashing, least request, zone/latency aware)
  • Observability (alerting, Layer 7 metrics, tracing)
  • Routing control (traffic mirroring, traffic shifting)
  • Security (authorization policies, end-to-end mutual TLS encryption, service level and method level access control)
  • Service discovery (distributed cache)


Application teams can implement a service mesh and deploy common implementations that fulfill standard requirements. The use of service meshes follows the basic principle behind Kubernetes itself: a standard interface that runs applications and meets their related operational needs.

Does VMware NSX Advanced Load Balancer Offer Kubernetes Service Mesh Support?

The VMware NSX Advanced Load Balancer integrates with Tanzu Service Mesh (TSM), which is built on top of Istio with value-added services. By expanding on the TSM solution, VMware NSX Advanced Load Balancer offers north-south connectivity, security, and observability inside and across Kubernetes clusters, multiple sites, and clouds. In addition, enterprises are able to connect modern Kubernetes applications to traditional application components in VM environments and clouds, secure transactions from end-users to the application, and seamlessly bridge between multiple environments.

For more on the actual implementation of load balancing, security applications and web application firewalls check out our Application Delivery How-To Videos.

Kubernetes Networking

<< Back to Technical Glossary

Kubernetes Networking Definition

Kubernetes, sometimes called K8s, is an open-source container orchestration platform. It is used to automate and manage the deployment, maintenance, monitoring, operation, and scheduling of application containers across a cluster of machines, either in the cloud or on-premises.

Kubernetes allows teams to manage containerized workloads across multiple environments, infrastructures, and operating systems. It is supported by all major container management platforms such as AWS EKS, Docker EE, IBM Cloud, OpenShift, and Rancher. Core Kubernetes themes to understand before discussing Kubernetes networking include:

  • Primary node. The primary node manages the Kubernetes cluster’s worker nodes and controls pod deployment.
  • Worker node. Worker nodes are servers that generally run Kubernetes components such as application containers and proxies in pods.
  • Service. A service is an abstraction with a stable IP address and ports that functions as a proxy or internal load balancer for requests across pods.
  • Pod. The pod is the basic deployment object in Kubernetes, each with its own IP address. A pod can contain one or multiple containers.
  • Other Kubernetes components. Additional important Kubernetes system components include the kubelet, the API server, and etcd.


Kubernetes was originally developed by Google. Administrators use Kubernetes cluster networking to move workloads across public, private, and hybrid cloud infrastructures. Developers use Kubernetes to package software applications together with the infrastructure required to run them, and to deploy new versions of software more quickly.

Kubernetes networking enables communication between Kubernetes components and between components and other applications. Because its flat network structure eliminates the need to map ports between containers, the Kubernetes platform provides a unique way to share machines between applications and run distributed systems without dynamically allocating ports.

There are several essential things to understand about Kubernetes networking:

  • Container-to-container networking: containers in the same pod communicating
  • Pod-to-pod networking, both same node and across nodes
  • Pod-to-service networking: pods and services communicating
  • DNS and internet-to-service networking: discovering IP addresses.


This image depicts a Kubernetes networking diagram of controller-0, controller-1, controller-2 and worker-0, worker-1, and worker-2 connected through the same network.

Kubernetes Networking FAQs

What is Kubernetes Networking?

Networking is critical to Kubernetes, but it is complex. There are several Kubernetes networking basics for any Kubernetes networking implementation:

  • All pods must be able to communicate with all other pods without NAT (network address translation).
  • All nodes and pods must be able to communicate without NAT.
  • The IP address that a pod sees for itself is the same IP address that others see for it.

These requirements leave a few distinct Kubernetes networking problems, and divide communication into several Kubernetes network types.

Container-to-container networking: communication between containers in the same pod

How do two containers running in the same pod talk to each other? This happens the same way multiple servers run on one device: via localhost and port numbers. This is possible because containers in the same pod share networking resources, IP address, and port space; in other words, they are in the same network namespace.
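As a sketch, a two-container pod that relies on this shared network namespace might look like the following (the names and images are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: localhost-demo             # hypothetical name
spec:
  containers:
  - name: app
    image: example.com/app:1.0     # placeholder: assumed to listen on port 8080
  - name: proxy
    image: example.com/proxy:1.0   # placeholder: can reach the app at
                                   # http://localhost:8080 because both containers
                                   # share the pod's network namespace
```

Note that the two containers must listen on different ports, since they share a single IP address and port space.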

A network namespace is a collection of network interfaces and routing tables: the connections between pieces of network equipment and the instructions for routing network packets. Kubernetes assigns running processes to the root network namespace by default, allowing them external IP address access.

Network namespaces enable users to run many network namespaces on the same virtual machine (VM) without interference or collisions.

Pod-to-pod communication

Pod-to-pod networking can occur for pods within the same node or across nodes. Each node has a classless inter-domain routing (CIDR) block, a tool used for routing traffic and allocating IP addresses. The CIDR block contains a defined set of unique IP addresses that are assigned to the node’s pods. This is intended to ensure that, regardless of which node it is in, each pod has a unique IP address.

Pods connect to each other using virtual Ethernet device pairs (veth pairs). Always created in interconnected pairs, these coupled Kubernetes container network interfaces span the root and pod namespaces, bridging the gap and acting as an intermediary connection between devices.

Requests travel between pods on the same node over a network bridge that connects the two networks. When it receives a request, the bridge asks all connected pods whether they have the destination IP address. If one of the pods does, the bridge forwards the data and the network request is completed.

Every pod on a Kubernetes node is part of this bridge, commonly named cbr0, which connects all of the node’s pods.

Pod-to-service networking: communication between pods and services

Pod IP addresses are fungible by design so they can be replaced as needed dynamically. These IP addresses disappear or appear in response to application crashes, scaling up or down, or node reboots.

Therefore, unless a team creates a stateful application or takes other special precautions, pod IP addresses are not durable. Because any number of events can cause unexpected changes to the pod IP address, Kubernetes uses built in services to deal with this and ensure seamless communication between pods.

Kubernetes services manage pod state over time, tracking pod IP addresses as they change. By assigning a cluster IP address (a single virtual IP address) to a group of pod IPs, these services abstract the pod IP addresses away. Traffic is first sent to the virtual IP address and then distributed to the associated group of pods.

The virtual IP enables Kubernetes services to load balance in-cluster and distribute traffic smoothly among pods and enables pods to be destroyed and created on demand without impacting communications overall.

This pod-to-service mapping happens via a small process running on each Kubernetes node: kube-proxy. Kube-proxy maps the virtual service IP addresses to actual pod IP addresses so that requests may proceed.
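The service abstraction described above can be sketched as a minimal manifest (the names and ports are hypothetical):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: backend          # in-cluster DNS name: backend.<namespace>.svc.cluster.local
spec:
  selector:
    app: backend         # traffic is forwarded to any ready pod with this label
  ports:
  - port: 80             # stable port on the service's cluster IP
    targetPort: 8080     # port the backing containers actually listen on
```

Pods matching the selector can come and go freely; clients keep using the one stable cluster IP and port.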

DNS and internet-to-service networking: discovering IP addresses

DNS is the system Kubernetes uses to convert domain names to IP addresses, and each Kubernetes cluster has a DNS resolution service. Every cluster’s service has an assigned domain name, and pods have a DNS name automatically as well. Pods can also specify their own DNS name.

When a request is made via the service domain name, the DNS service resolves it to the service IP address. Next, kube-proxy converts the service IP address into a pod IP address. Then the request follows a given path, depending on whether the destination pod is on the same node or on a different node.

Finally, most deployments require internet connectivity, enabling collaboration between distributed teams and services. To enable external access in Kubernetes and control traffic into and out of your network, you must set up egress and ingress policies with either allowlisting or denylisting.

Egress routes traffic from the node to an outside connection. Typically attached to a virtual private cloud (VPC), this is often achieved using an internet gateway that maps IPs between the host machine and the users with network address translation (NAT). However, because the gateway cannot map to the individual pods on the node, Kubernetes uses cluster IPs and iptables to finalize communications.

Kubernetes networking ingress—getting traffic into the cluster—is a tricky challenge that involves communications to Kubernetes services from external clients. Ingress allows and blocks particular communications with services, operating as a defined set of rules for connections. Typically, there are two ingress solutions that function on different network stack regions: the service load balancer and the ingress controller.

You can specify a load balancer to go with a Kubernetes service you create. Implement the Kubernetes network load balancer using a compatible cloud controller for your service. After you create your service, the IP address of the load balancer is what faces the public, and you can direct traffic to that load balancer to begin communicating with your service.

Packets that flow through an ingress controller have a very similar trajectory. However, the initial connection between node and ingress controller is via the exposed port on the node for each service. Also, the ingress controller can route traffic to services based on their path.
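As a sketch of path-based routing through an ingress controller, an Ingress resource might direct traffic to two hypothetical services like this:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress           # hypothetical name
spec:
  rules:
  - host: example.com         # placeholder hostname
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-service # hypothetical backing services
            port:
              number: 80
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-service
            port:
              number: 80
```

The ingress controller watches resources like this one and configures its proxy so that `/api` requests reach one service and all other paths reach the other.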

Kubernetes Network Policy

Kubernetes network policies enable the control of traffic flow for particular cluster applications at the IP address and port levels (OSI layers 3 and 4). An application-centric construct, Kubernetes network policies allow users to specify how pods and various network entities such as services and endpoints are permitted to communicate over the network.

A combination of the following three identifiers determines how entities and pods can communicate:

  • A list of other allowed pods, although a pod cannot block access to itself;
  • A list of allowed namespaces; or
  • Via IP blocks that work like exceptions: traffic to and from the pod’s node is always allowed, regardless of the pod or node’s IP address.


Network policies defined by pods or namespaces are based on a selector that specifies what traffic is allowed to and from the pod(s) in question. Any traffic that matches the selector is permitted.

Network policies based on IP blocks are defined using CIDR ranges.

Network policies are enforced by the cluster’s network plugin, so the plugin in use must support network policies for them to take effect.

Pods are non-isolated by default, meaning without having a Kubernetes network policy applied to them, they accept traffic from any source. Once a network policy in a namespace selects a particular pod, that pod becomes isolated and will reject any connections that the policy does not allow even as other unselected pods in the namespace continue to accept all traffic.

Kubernetes network policies are additive. If any policies select and restrict a pod, it is subject to the combined ingress/egress rules of all policies that apply to it. The policy result is unaffected by order of evaluation.
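These isolation and additive semantics can be sketched in a few lines; the labels, selectors, and source names below are invented for illustration:

```python
# Sketch of Kubernetes network-policy semantics: pods start
# non-isolated, and once any policy selects a pod, traffic is allowed
# only if the union of all selecting policies permits it.
# Labels and source names here are hypothetical.
def is_ingress_allowed(pod_labels, source, policies):
    """policies: list of (selector, allowed_sources) pairs."""
    selecting = [allowed for selector, allowed in policies
                 if selector.items() <= pod_labels.items()]
    if not selecting:          # no policy selects the pod: non-isolated
        return True
    # Additive: allowed if ANY selecting policy permits the source.
    return any(source in allowed for allowed in selecting)

policies = [
    ({"app": "db"}, {"backend"}),    # db pods accept from backend
    ({"app": "db"}, {"metrics"}),    # a second policy adds metrics
]
```

Note how the second policy widens, and can never narrow, what the first allows, which is the additive behavior described above.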

For a complete definition of the resource, see the NetworkPolicy reference.

Common Kubernetes Pod Network Implementations

There are many ways to implement the Kubernetes network model and many third-party tools to choose from; these are a few popular Kubernetes network providers:

Flannel, an open-source Kubernetes network fabric, uses etcd to store configuration data, allocates subnet leases to hosts, and operates through a binary agent on each host.

Project Calico or just Calico, an open-source policy engine and networking provider, enables a scalable networking solution with enforced Kubernetes security policies on your service mesh layers or host networking.

Canal is the informal name for the integration of the Flannel networking layer with Calico's Kubernetes network policy capabilities (the original integration project has since been retired). Canal pairs the simple Flannel overlay with the powerful security of Calico for an easier way to enhance security at the network layer.

Weave Net, an open-source networking toolkit based on a decentralized architecture, includes virtual networking features for scalability, resilience, multicast networking, security, and service discovery. It also doesn’t require any external storage or configuration services.

Kubernetes Networking Challenges

As organizations bring Kubernetes into production, security, infrastructure integration, and other major bottlenecks to deployment remain. Here are some of the most common Kubernetes container networking challenges.

Differences from traditional networking. Because containers migrate dynamically, Kubernetes network addressing is dynamic as well. Classic DHCP can become a hindrance to speed, and traditional ports and hard-coded addresses can make securing the network a problem. Furthermore, particularly for large-scale deployments, there are likely to be too many addresses to assign and manage, as well as IPv4 limitations on the number of hosts inside one subnet.

Security. Network security is always a challenge, and it is essential to specifically define Kubernetes network policies to minimize risk. Managed solutions, cloud-native services, and platforms help render Kubernetes deployment networks more secure and ensure flawed applications in Kubernetes managed containers are not overlooked. For example, there is no default Kubernetes network encryption for traffic, so the right service mesh can help achieve a Kubernetes encrypted network.

Dynamic network policies. The need to define and change Kubernetes network policies for a pod or container requires the creation of NetworkPolicy resources. This in essence demands creating configuration files for many pods and containers, and then modifying them in a process that can be tedious, difficult, and prone to error. Kubernetes networking solutions such as Calico can allow you to abstract vendor specific network APIs and implementations and change the policy more easily based on application requirements.

Abstraction. Kubernetes adoption can involve data center design or redesign, restructuring, and implementation. Existing data centers must be modified to suit container-based Kubernetes network architectures. Use of a cloud-native, software-defined environment suitable for both non-container and container-based applications is one solution to this issue, allowing you to define container traffic management, security policies, load balancing, and other network challenges through REST APIs. Such an environment also provides an automation layer for DevOps integration, rendering Kubernetes-based applications more portable across cloud providers and data centers.

Complexity. Particularly at scale, complexity remains a challenge for Kubernetes and any container-based networking. Kubernetes service mesh solutions for cloud-native and container-based applications don’t require code changes and offer features such as routing, service discovery, visibility into applications, and failure handling.

Reliability of communication. As the number of pods, containers, and services increases, so does the importance—and challenge—of service-to-service communication and reliability. Harden Kubernetes network policies and other configurations to ensure reliable communications, and manage and monitor them on an ongoing basis.

Connectivity and debugging. As with any other network technology, Kubernetes network performance can vary, and Kubernetes networks can only succeed with the right tools for monitoring and debugging. Kubernetes network latency is its own challenge and network stalls may require debugging.

Does VMware NSX Advanced Load Balancer Offer Kubernetes Networking Monitoring and Application Services?

Yes. VMware NSX Advanced Load Balancer provides a centrally orchestrated, elastic proxy services fabric with dynamic load balancing, service discovery, security, and analytics for containerized applications running in Kubernetes environments.

Kubernetes adoption demands a cloud-native approach for traffic management and application networking services, which VMware NSX Advanced Load Balancer provides. The VMware NSX Advanced Load Balancer delivers scalable, enterprise-class container ingress to deploy and manage container-based applications in production environments accessing Kubernetes clusters.

VMware NSX Advanced Load Balancer provides a container services fabric with a centralized control plane and distributed proxies:

Controller: A central control, management, and analytics plane that communicates with the Kubernetes primary node. The Controller includes two sub-components, the Avi Kubernetes Operator (AKO) and the Avi Multi-Cluster Kubernetes Operator (AMKO), which orchestrate all interactions with the kube-controller-manager. AKO handles ingress services in each Kubernetes cluster, and AMKO operates in the context of multiple clusters, sites, or clouds. The Controller deploys and manages the lifecycle of data plane proxies, configures services, and aggregates telemetry analytics from the Service Engines.

Service Engine: A service proxy providing ingress services such as load balancing, WAF, GSLB, IPAM/DNS in the dataplane and reporting real-time telemetry analytics to the Controller.

For more on the actual implementation of load balancing, security applications and web application firewalls check out our Application Delivery How-To Videos.

Find out more about how VMware NSX Advanced Load Balancer’s cloud-native approach for traffic management and application networking services can assist your organization’s Kubernetes adoption here.

Kubernetes Security


Kubernetes Security Definition

Kubernetes is an extensible, portable, open-source container orchestration platform that dominates the enterprise market. A huge number of organizations manage some portion of their container workloads and services using Kubernetes, making it a massive, rapidly expanding ecosystem. This means Kubernetes security tools, support, and services are widely available, but there are also serious security risks in Kubernetes container environments, including runtime security incidents, misconfigurations, and other Kubernetes security vulnerabilities.

Kubernetes security risks generally correspond to phases in the container lifecycle. Therefore, best practices and practical recommendations for Kubernetes security—sometimes referred to as Kubernetes container security—are linked to responding correctly to threats at runtime, avoiding misconfigurations during the deploy and build phases, and remediating known vulnerabilities during the build phase. These Kubernetes security best practices are essential to securing cloud-native infrastructure and applications.

Image depicts a Kubernetes Security diagram of the 4C's of cloud native security: cloud, clusters, containers, and code.


Kubernetes Security FAQs

What is Kubernetes Security?

Kubernetes security vulnerabilities and challenges are varied and numerous:

Containers are widespread. Containers enable greater portability, speed, and ability to leverage microservices architectures, but they can also increase your attack surface and create security blind spots. Their distributed nature also makes identifying risks, misconfigurations, and vulnerabilities quickly more difficult. It is also more challenging to maintain adequate visibility into your cloud-native infrastructure as more containers are deployed.

Misused images and image registries can pose security risks. Businesses need strong governance policies for building and storing images in trusted image registries. Build container images using approved, secure base images that are scanned regularly. Launch containers in a Kubernetes environment with only images from allowlisted image registries.

Containers must communicate. Containers and pods talk to each other and to internal and external endpoints to function properly. This makes for a sprawling deployment environment in which it is often prohibitively difficult to implement network segmentation. A malicious actor has the potential to move about broadly inside the environment should they breach a container, based on how broadly it communicates with other containers and pods.

Default Kubernetes configuration options are typically the least secure. Kubernetes is designed to simplify management and operations and speed application deployment in keeping with DevOps principles, so its defaults favor openness over restriction. A rich set of controls exists for securing clusters and their applications effectively, but those controls must be applied deliberately.

For example, Kubernetes network policies control how pods communicate, much like firewall rules. When a pod has an associated Kubernetes network policy, it is only allowed to communicate with the assets that policy defines. However, Kubernetes does not apply a network policy to a pod by default, meaning that all pods can talk to each other in a Kubernetes environment, leaving them open to risk.

Management, access, and storage of secrets and sensitive data such as keys and credentials present another configuration risk in containers. Secrets should be mounted into read-only volumes, not passed as environment variables.

Compliance challenges for Kubernetes and containers are unique. Internal organizational policies, industry standards and benchmarks, and security best practices were often created not for cloud-native environments, but for traditional application architectures. Businesses must automate audits and monitoring to successfully operate at scale, given the dynamic and distributed nature of containerized applications.

Kubernetes runtime security challenges for containers are both familiar and new. Kubernetes can be treated as immutable infrastructure, destroyed and recreated from a common template rather than changed or patched when it’s time to update. This is a security advantage of containers, but their launch and removal speed and general ephemerality is also a challenge. Detecting a threat in a running container means stopping and relaunching—but not without identifying the root problem and reconfiguring whatever component caused it. Runtime security risks also include running malicious processes in compromised containers; crypto mining and network port scanning for open paths to valuable resources are examples of this.

To cope with these Kubernetes security concerns and others, it is essential to integrate security into each phase of the container lifecycle: building, deployment, and runtime. Following best practices for building and configuring your Kubernetes cluster and deployments and securing Kubernetes infrastructure reduces overall risk.

Kubernetes Security Best Practices

Build Phase

To secure Kubernetes clusters and containers, start in the build phase by building secure container images and scanning images for known vulnerabilities. Best practices include:

Use minimal base images. Avoid using base images that include shells or OS package managers. If you must include OS packages, which could contain vulnerabilities, remove the OS package manager at a later build step.

Remove extra components. Remove debugging tools from containers in production. Do not include or retain common tools that could be useful to attackers in images.

Update images and third-party tools. All images and tools you include should be up to date, with the latest versions of their components.

Identify known vulnerabilities with a Kubernetes security scanner. Scan images by layer for vulnerabilities in third-party runtime libraries and OS packages your containerized applications use. Identify vulnerabilities within your images and determine whether they are fixable. Label non-fixable vulnerabilities and add them to a filter or allowlist so the team does not get hung up on non-actionable alerts.
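The allowlist triage described above might look like this in miniature; the CVE identifiers are placeholders:

```python
# Sketch of triaging scanner output: suppress vulnerabilities that were
# reviewed and found non-fixable, so alerts stay actionable.
# The CVE identifiers below are placeholders, not real findings.
ALLOWLIST = {"CVE-2023-0001"}   # reviewed: no fix available, risk accepted

def actionable(finding_ids):
    """Drop allowlisted findings so the team sees only actionable alerts."""
    return [f for f in finding_ids if f not in ALLOWLIST]
```

In practice the allowlist would be version-controlled and periodically re-reviewed, since a fix may later become available.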

Integrate security. Use image scanning and other Kubernetes security testing tools to integrate security into your CI/CD pipeline. This automates security and generates alerts when fixable, severe vulnerabilities are detected.

Implement defense-in-depth and remediation policies. A security risk identified in a container image that is already running in a deployment demands immediate action, so keep a remediation workflow and policy checks in place. This allows the team to detect such images and act to update them immediately.

Deploy Phase

Before deployment, Kubernetes infrastructure must be configured securely. This demands visibility into what types of Kubernetes infrastructure will be deployed, and how. To properly engage in Kubernetes security testing and identify and respond to security policy violations, you need to know:

  • what will be deployed, including the pods that will be deployed, and image information, such as vulnerabilities or components
  • where deployment will happen, including which namespaces, clusters, and nodes
  • the shape of the deployment—for example communication permissions, and pod security context
  • what it can access, including volumes, secrets, and other components such as the orchestrator API or host
  • compliance—does it meet security requirements and policies?


Based on these factors, follow these best practices for Kubernetes security during deployment:

Isolate sensitive workloads and other Kubernetes resources with namespaces. A key isolation boundary, namespaces provide a reference for access control restrictions, network policies, and other critical security controls. Limit the impact of destructive actions or errors by authorized users and help contain attacks by separating workloads into namespaces.

Deploy a service mesh. A service mesh tightly integrates with the infrastructure layer of the application, offering a consistent way to secure, connect, and observe microservices. A service mesh controls how various services share data in their east-west communication in a distributed system, usually with sidecar proxies.

A service mesh delivers dynamic traffic management and service discovery, including traffic splitting for incremental rollout, canary releasing, and A/B testing, and traffic duplicating or shadowing. Situated along the critical path for all system requests, a service mesh can also offer insights and added transparency into latency, frequency of errors, and request tracing. A service mesh also supports cross-cutting requirements, such as reliability (circuit-breaking and rate limiting) and security (providing TLS and service identity).

Control traffic between clusters and pods with Kubernetes network policies. Default Kubernetes configurations that allow every pod to communicate are risky. Network segmentation policies can prevent cross-container lateral movement by an attacker.

Limit access to secrets. To prevent unnecessary exposure, ensure deployments access only secrets they require.

Assess container privileges. Kubernetes security assessment should consider container capabilities, privileges, and role bindings, which all come with security risk. The least privilege that allows intended function and capabilities is the goal.

Control pod security attributes, including privilege levels of containers, with pod security policies. These allow the operator to specify, for example:

  • Do not allow privilege escalation.
  • Do not run application processes as root.
  • Use a read-only root filesystem.
  • Drop unnecessary and unused Linux capabilities.
  • Do not use the host network or process space.
  • Give each application its own Kubernetes Service Account.
  • Use SELinux options for more fine-tuned control.
  • If it does not need to access the Kubernetes API, do not mount the service account credentials in a container.


Assess image provenance. To maintain Kubernetes security, don’t deploy code from unknown sources, and use images from allowlisted or known registries only.

Extend image scanning into the deploy phase. New vulnerabilities can be disclosed between the time an image is scanned and the time it is deployed, so enforce policies at the deploy phase as well, for example by rejecting images built more than 90 days ago or by using an automated admission tool.
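Such an age-based admission check can be sketched in a few lines; the 90-day cutoff mirrors the example above, and the timestamps are hypothetical:

```python
# Sketch of a deploy-phase admission check that rejects images built
# too long ago. In practice the build timestamp would come from image
# metadata; the cutoff and dates here are illustrative assumptions.
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=90)

def admit(built_at: datetime, now: datetime) -> bool:
    """Allow the image only if its build date is within the age limit."""
    return now - built_at <= MAX_AGE
```

A real enforcement point would sit in an admission controller or CI/CD gate rather than application code.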

Use annotations and labels appropriately so teams can identify Kubernetes security issues and respond to them easily.

Enable Kubernetes role-based access control (RBAC). Kubernetes RBAC controls access authorization to a cluster’s Kubernetes API servers, both for service accounts and users in the cluster.

Runtime Phase

During the runtime phase, containerized applications experience a range of new security challenges:

Leverage contextual information in Kubernetes. Context from the build and deploy phases can help your team compare actual runtime activity against expected activity to identify anything suspicious.

Extend vulnerability scanning and monitoring to container images in running deployments. This should include newly discovered vulnerabilities.

Tighten security with built-in Kubernetes controls. Limit the capabilities of pods to eliminate classes of attacks that require privileged access. For example, read-only root file systems can prevent any attacks that depend on writing to the file system or installing software.

Monitor active network traffic. Limit insecure and unnecessary communication by comparing existing traffic to allowable traffic based on Kubernetes network security policies.

Leverage process allowlisting to identify unexpected running processes. Creating this kind of allowlist from scratch can be challenging; look to security vendors with expertise in Kubernetes and containers.

Analyze and compare runtime activity of the same deployments in different pods. Containerized applications may be replicated for fault tolerance, high availability, or scale. Replicas should behave almost identically; if they do not, investigate further.

Scale suspicious pods or stop and restart in case of Kubernetes security breach. Contain a successful breach using Kubernetes native controls by instructing them to stop and then restart instances of breached applications or automatically scale suspicious pods to zero.

Follow the CIS benchmarks for Kubernetes security best practices as well.

Does VMware NSX Advanced Load Balancer provide a Kubernetes Security Solution?

VMware NSX Advanced Load Balancer is based on a scale-out, software-defined architecture that provides observability, traffic management, Kubernetes security, and a rich set of tools to ease rollouts and application maintenance. VMware NSX Advanced Load Balancer offers an elastic, centrally orchestrated proxy services fabric with analytics, cloud-native web application security, and load balancing and ingress services for container-based applications running in Kubernetes environments.

The VMware NSX Advanced Load Balancer provides the cloud-native approach for application networking services and traffic management that enterprises adopting Kubernetes need. VMware NSX Advanced Load Balancer delivers scalable, enterprise-class container ingress to deploy and manage cloud-native applications in Kubernetes environments.

Learn how to secure your application and data by making intent-based decisions on whether to authorize, block, or quarantine access based on a Common Vulnerability Scoring System (CVSS) score above a predefined threshold here.

For more on the actual implementation of load balancing, security applications and web application firewalls check out our Application Delivery How-To Videos.

Kubernetes Load Balancer


Kubernetes Load Balancer Definition

Kubernetes is an enterprise-level container orchestration system. In many non-container environments load balancing is relatively straightforward—for example, balancing between servers. However, load balancing between containers demands special handling.

A core strategy for maximizing availability and scalability, load balancing distributes network traffic among multiple backend services efficiently. A range of options for load balancing external traffic to pods exists in the Kubernetes context, each with its own benefits and tradeoffs.

Load distribution is the most basic type of load balancing in Kubernetes, and it is easy to implement at the dispatch level. Both of the load distribution methods that exist in Kubernetes operate through the kube-proxy feature, which manages the virtual IPs that services use.

The former default kube-proxy mode was userspace, which allocates the next available Kubernetes pod using round-robin load distribution on an IP list, and then rotates or otherwise permutes the list. Modern kube-proxy default mode, called iptables, enables sophisticated rule-based IP management. In iptables mode, random selection is the native method for load distribution. In this situation, an incoming request goes to one of a service’s pods that is randomly chosen.
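The two distribution methods can be contrasted in a few lines of Python; the pod names are made up:

```python
# Sketch contrasting the two kube-proxy distribution methods described
# above: round-robin rotation over a pod list (the old userspace mode)
# versus uniform random choice (iptables mode). Pod names are made up.
import itertools
import random

PODS = ["pod-a", "pod-b", "pod-c"]

round_robin = itertools.cycle(PODS)      # userspace-style rotation

def pick_random(rng=random):             # iptables-style selection
    """Pick a backend pod uniformly at random."""
    return rng.choice(PODS)
```

Neither method inspects backend load, which is why the text notes that these techniques fall short of true load balancing.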

However, neither of these techniques provides true load balancing. The Ingress load balancer for Kubernetes offers the most flexible and popular method, as well as cloud service-based load-balancing controllers and other tools for Ingress from service providers and other third parties.

Ingress operates using a controller which includes an Ingress resource and a daemon which applies its rules in a specialized Kubernetes pod. The controller has its own built-in capabilities, including sophisticated load balancing features. In an Ingress resource, you can also adjust for specific vendor or system requirements and load-balancing features by including more detailed load-balancing rules.


Image depicts a Kubernetes Load Balancer diagram of application clients (end users) implementing load balancers for balanced Kubernetes clusters.


Kubernetes Load Balancer FAQs

What is Kubernetes Load Balancer?

Kubernetes is an extensible, portable, open-source platform for managing containerized services and workloads. Its large, rapidly growing ecosystem facilitates both automation and declarative configuration and Kubernetes support and tools are widely available.

In the internet’s earlier days, organizations experienced resource allocation issues as they ran applications on physical servers. One solution was to run separate applications on unique physical servers, but this is expensive and lacks ability to scale.

Virtualization was the next step which allowed organizations to run multiple Virtual Machines (VMs) on the CPU of a single physical server. Virtualization enables improved scalability, better utilization of resources, reduced hardware costs, and more. With virtualization, each VM runs all components on top of the virtualized hardware separately, including its own operating system.

Containers resemble VMs, except they are considered more lightweight, and can share the Operating System (OS) among the applications due to their relaxed isolation properties. However, like a VM, a container is portable across OS distributions and clouds, and amenable to being run by a system—such as Kubernetes.

In Kubernetes, a pod is a set of containers that are related by function, and a service is a set of related pods that perform the same functions. Kubernetes can create and destroy pods automatically based on need, without additional input from the user, because pods are designed not to be persistent.

IP addresses for Kubernetes pods are not persistent because the system assigns each new pod a new IP address. Typically, therefore, direct communication between pods is impossible. However, services have their own relatively stable IP addresses which field requests from external resources. The service then dispatches the request to an available Kubernetes pod.

Kubernetes load balancing makes the most sense in the context of how Kubernetes organizes containers. Kubernetes does not view single containers or individual instances of a service, but rather sees containers in terms of the specific services or sets of services they perform or provide.

The Kubernetes pod, a set of containers, along with their shared volumes, is a basic, functional unit. Containers are typically closely related in terms of services and functions they provide.

Services are sets of Kubernetes pods that have the same set of functions. These Kubernetes services stand in for the individual pods as users access applications, and the Kubernetes scheduler ensures that you have the optimal number of pods running at all times by creating and deleting pods as needed. In other words, Kubernetes services are themselves the crudest form of load balancing traffic.

In Kubernetes the most basic type of load balancing is load distribution. Kubernetes uses two methods of load distribution. Both of them are easy to implement at the dispatch level and operate through the kube-proxy feature. Kube-proxy manages virtual IPs for services.

The default kube-proxy mode for rule-based IP management is iptables, and the iptables mode native method for load distribution is random selection. Previously, kube-proxy default mode was userspace, with its native method for load distribution being round-robin.

There are several cases when you might access services using the Kubernetes proxy:

  • Allowing internal traffic
  • Connecting to them directly from a computer
  • Debugging services
  • Displaying internal dashboards


However, you should not use this method for production services or to expose your service to the internet, because the kube-proxy requires you to run kubectl as an authenticated user.

In any case, for true load balancing, Ingress offers the most popular method. Ingress operates using a controller with an Ingress resource and a daemon. The Ingress resource is a set of rules governing traffic. The daemon applies the rules inside a specialized Kubernetes pod. The Ingress controller has its own sophisticated capabilities and built-in features for load balancing and can be customized for specific vendors or systems.

A cloud service-based Kubernetes external load balancer may serve as an alternative to Ingress, although the capabilities of these tools are typically provider-dependent. External network load balancers may also lack granular access at the pod level.

There are many varieties of Ingress controllers, with various features, and a range of plugins for Ingress controllers, such as cert-managers that provision SSL certificates automatically.

How to Configure Load Balancer in Kubernetes?

Load balancing, a critical strategy for maximizing availability and scalability, is the process of distributing network traffic efficiently among multiple backend services. A number of Kubernetes load balancer strategies and algorithms for managing external traffic to pods exist. Each has its strengths and weaknesses.

Round Robin

In a round robin method, a sequence of eligible servers receive new connections in order. This algorithm is static, meaning it does not account for varying speeds or performance issues of individual servers, so a slow server and a better performing server will still receive an equal number of connections. For this reason, round robin load balancing is not always ideal for production traffic and is better for basic load testing.

Kube-proxy L4 Round Robin Load Balancing

The most basic default Kubernetes load balancing strategy in a typical Kubernetes cluster comes from the kube-proxy. The kube-proxy fields all requests that are sent to the Kubernetes service and routes them.

However, because the kube-proxy is actually a process rather than a proxy, it uses iptables rules to implement a virtual IP for the service, adding architecture and complexity to the routing. With each request, additional latency is introduced, and this problem grows with the number of services.

L7 Round Robin Load Balancing

In most cases, it is essential to route traffic directly to Kubernetes pods and bypass the kube-proxy altogether. Achieve this with an API gateway for Kubernetes that uses an L7 proxy to manage requests among available Kubernetes pods.

The load balancer tracks the availability of pods with the Kubernetes Endpoints API. When it receives a request for a specific Kubernetes service, the load balancer round-robins the request among the relevant Kubernetes pods backing that service.
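A per-service rotation of this kind can be sketched as follows, assuming hypothetical service and pod names (a real balancer would populate the table from the Endpoints API):

```python
# Hedged sketch of an L7 balancer keeping a per-service pod list and
# round-robining each service independently. Service and pod names are
# hypothetical; real entries would come from the Endpoints API.
import itertools

endpoints = {
    "checkout": itertools.cycle(["checkout-1", "checkout-2"]),
    "search":   itertools.cycle(["search-1"]),
}

def dispatch(service: str) -> str:
    """Route a request for `service` to the next pod in its rotation."""
    return next(endpoints[service])
```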

Consistent Hashing/Ring Hash

In consistent hashing algorithms, the Kubernetes load balancer distributes new connections across the servers using a hash that is based on a specified key. Best for load balancing large numbers of cache servers with dynamic content, this algorithm inherently combines load balancing and persistence.

This algorithm is consistent because there is no need to recalculate the entire hash table each time a server is added or removed. Visualize a ring of nine servers in a pool or cache: adding a tenth server does not force a re-cache of all content. Instead, only the keys that hash to the new server's position, roughly a tenth of the total, move to it; other connections are not disrupted.

The consistent or ring hash approach is used for sticky sessions in which the system ensures the same pod receives all requests from one client by setting a cookie. This method is also used for session affinity, which requires client IP address or some other piece of client state.

The consistent hashing approach is useful for shopping cart applications and other services that maintain per-client state. The need to synchronize states across pods is eliminated when the same clients are routed to the same pods. The likelihood of cache hit also increases as client data is cached on a given pod.

The weakness of ring hash is that client workloads may not be equal, so evenly distributing load between different backend servers can be more challenging. Furthermore, particularly at scale, the hash computation cost can add some latency to requests.

Google’s Maglev is a type of consistent hashing algorithm. Maglev has a costly downside for microservices, however: it is expensive to generate the lookup table when a node fails.
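A minimal ring-hash implementation illustrates why only a small share of keys move when a server is added. This sketch uses one hash point per server for brevity, whereas production rings typically add virtual nodes per server for smoother balance:

```python
# Minimal consistent-hash ring, sketching the scheme described above:
# each server owns an arc of the ring, and adding a server remaps only
# the keys that fall on its new arc. One hash point per server is used
# for brevity; real rings add virtual nodes for smoother balance.
import bisect
import hashlib

class HashRing:
    def __init__(self, servers):
        self.ring = sorted((self._hash(s), s) for s in servers)

    @staticmethod
    def _hash(key):
        # A stable, well-distributed hash (not for security purposes).
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, server):
        bisect.insort(self.ring, (self._hash(server), server))

    def lookup(self, key):
        """Return the server owning the arc that `key` hashes onto."""
        points = [p for p, _ in self.ring]
        i = bisect.bisect(points, self._hash(key)) % len(self.ring)
        return self.ring[i][1]
```

After `add`, every key either keeps its old server or moves to the new one, which is the property that makes the hash "consistent".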

Fastest Response

Also called weighted response time, the fastest response method sends new connections to whichever server is currently offering the quickest response to new requests or connections. Fastest response is usually measured as time to first byte.

This method works well when the servers in the pool process short-lived connections or have varying capabilities. One caveat: a failing server can respond very quickly with errors, such as HTTP error responses from a server that has lost its connection to a data store. Frequent health checks can help mitigate this kind of issue.
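Fastest-response selection reduces to picking the server with the lowest recent measurement; the timings below are hypothetical:

```python
# Sketch of fastest-response (weighted response time) selection: send
# the new connection to the server with the lowest recent
# time-to-first-byte. The server names and timings are made up.
def fastest(response_times):
    """Pick the server whose recent response time (seconds) is lowest."""
    return min(response_times, key=response_times.get)

measured = {"srv-1": 0.120, "srv-2": 0.045, "srv-3": 0.300}
```

A production balancer would smooth these measurements (e.g., a moving average) and combine them with health checks so that fast error responses do not win.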

Fewest Servers

A fewest servers strategy determines the fewest number of servers required to satisfy current client load rather than distributing all requests across all servers. Servers deemed excess can be temporarily powered down or de-provisioned.

This kind of algorithm works by monitoring changes in response latency as the load adjusts based on server capacity. The Kubernetes load balancer sends connections to the first server in the pool until it is at capacity, and then sends new connections to the next available server. This algorithm is ideal where virtual machines incur a cost, such as in hosted environments.

Least Connections

The least connections dynamic Kubernetes load balancing algorithm distributes client requests to the application server with the fewest active connections at the time of the request. By taking active connection load into account, it avoids overloading a server that, despite having specifications similar to its peers, is tied up with longer-lived connections.

The weighted least connection algorithm builds on the least connection method. The administrator assigns a weight to each application server to account for their differing characteristics based on various criteria that demonstrate traffic-handling capability.

The least connections algorithm is generally adaptive to slower or unhealthy servers, yet offers equal distribution when all servers are healthy. This algorithm works well for both quick and long lived connections.
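A minimal sketch of the weighted least connections variant, assuming hypothetical pod names and static administrator-assigned weights: each new connection goes to the server with the lowest ratio of active connections to weight, so heavier servers absorb proportionally more load.

```python
class WeightedLeastConnections:
    """Pick the server with the lowest active-connections-to-weight ratio."""

    def __init__(self, weights):
        self.weights = dict(weights)            # server -> capacity weight
        self.active = {s: 0 for s in weights}   # server -> open connections

    def acquire(self):
        # Lowest active/weight ratio wins; ties go to the first server listed.
        server = min(self.active, key=lambda s: self.active[s] / self.weights[s])
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when a connection closes so the counts stay accurate.
        self.active[server] -= 1

# pod-b is weighted as three times more capable than pod-a.
lb = WeightedLeastConnections({"pod-a": 1, "pod-b": 3})
picks = [lb.acquire() for _ in range(4)]
# pod-b takes three of the first four connections, matching its weight.
```

Setting all weights equal reduces this to plain least connections.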

Resource Based/Least Load

Resource based or least load algorithms send new connections to the server with the lightest load, irrespective of how many connections it has. For example, a load balancer receives one HTTP request requiring a 200 kB response and a second requiring a 1 kB response, and sends them to different servers. For subsequent requests, it estimates from past response times which server is more available: the one still streaming 200 kB or the one that has finished sending 1 kB. This ensures a quick, small request does not get queued behind a long one. Because least load is HTTP-specific, the algorithm defaults to the least connections method for non-HTTP traffic.

Does VMware NSX Advanced Load Balancer Offer a Kubernetes Load Balancer?

Microservices-based modern application architectures have rendered appliance-based load balancing solutions obsolete. Containerized applications deployed in Kubernetes clusters need scalable, enterprise-class Kubernetes Ingress Services for load balancing, monitoring/analytics, service discovery, global and local traffic management, and security.

The VMware NSX Advanced Load Balancer Kubernetes ingress controller with multi-cloud application services offers high levels of automation based on machine learning, enterprise-grade features, and the observability that can usher container-based applications into enterprise production environments.

Based on a software-defined, scale-out architecture, VMware NSX Advanced Load Balancer provides container services for Kubernetes beyond typical Kubernetes service controllers, such as security, observability, traffic management, and a rich set of application maintenance and rollout tools. The centrally orchestrated, elastic proxy services fabric from VMware NSX Advanced Load Balancer provides analytics, dynamic load balancing, micro-segmentation, security, and service discovery for containerized applications running in Kubernetes environments.

The VMware NSX Advanced Load Balancer offers a cloud-native approach and delivers a scalable, enterprise-class container ingress to deploy and manage container-based applications in production environments accessing Kubernetes clusters. VMware NSX Advanced Load Balancer provides a container services fabric with a centralized control plane and distributed proxies:

  • Controller: A central control, management and analytics plane that communicates with the Kubernetes control plane, deploys and manages the lifecycle of data plane proxies, configures services and aggregates telemetry analytics from the Service Engines.
  • Service Engine: A service proxy providing ingress services such as load balancing, WAF, GSLB, IPAM/DNS in the dataplane and reporting real-time telemetry analytics to the Controller.


The VMware NSX Advanced Load Balancer extends L4-L7 services with automation, elasticity/autoscaling and continuous delivery onto Kubernetes Platform-as-a-Service (PaaS). Also, VMware NSX Advanced Load Balancer provides unprecedented visibility into Kubernetes applications showing service dependencies using application maps.

Find out more about VMware NSX Advanced Load Balancer’s Kubernetes ingress controller load balancer here.

For more on the actual implementation of load balancers, check out our Application Delivery How-To Videos.

Kubernetes Architecture


Kubernetes Architecture Definition

Kubernetes is an open source container deployment and management platform. It offers container orchestration, a container runtime, container-centric infrastructure orchestration, load balancing, self-healing mechanisms, and service discovery. Kubernetes architecture, also sometimes called Kubernetes application deployment architecture or Kubernetes client server architecture, is used to compose, scale, deploy, and manage application containers across host clusters.

An environment running Kubernetes consists of the following basic components: a control plane (Kubernetes control plane), a distributed key-value storage system for keeping the cluster state consistent (etcd), and cluster nodes (worker nodes, sometimes called minions, each running a kubelet agent).


Image depicts a Kubernetes Architecture diagram with the different components like control plane, nodes, pods and more.


Kubernetes Architecture FAQs

What is Kubernetes Architecture?

A Kubernetes cluster is a form of Kubernetes deployment architecture. Basic Kubernetes architecture exists in two parts: the control plane and the nodes or compute machines. Each node could be either a physical or virtual machine and is its own Linux environment. Every node also runs pods, which are composed of containers.

Kubernetes architecture components, or K8s components, include the Kubernetes control plane and the nodes in the cluster. The control plane machine components include the Kubernetes API server, Kubernetes scheduler, Kubernetes controller manager, and etcd. Kubernetes node components include a container runtime engine such as Docker, a kubelet service, and a Kubernetes proxy service.

Kubernetes Control Plane

The control plane is the nerve center that houses Kubernetes cluster architecture components that control the cluster. It also maintains a data record of the configuration and state of all of the cluster’s Kubernetes objects.

The Kubernetes control plane is in constant contact with the compute machines to ensure that the cluster runs as configured. Controllers respond to cluster changes to manage object states and drive the actual, observed state or current status of system objects to match the desired state or specification.

Several major components comprise the control plane: the API server, the scheduler, the controller-manager, and etcd. These core Kubernetes components ensure containers are running with the necessary resources in sufficient numbers. These components can all run on one primary node, but many enterprises concerned about fault tolerance replicate them across multiple nodes to achieve high availability.

Kubernetes API Server

The front end of the Kubernetes control plane, the API Server supports updates, scaling, and other kinds of lifecycle orchestration by providing APIs for various types of applications. Clients must be able to access the API server from outside the cluster, because it serves as the gateway, supporting lifecycle orchestration at each stage. In that role, clients use the API server as a tunnel to pods, services, and nodes, and authenticate via the API server.

Kubernetes Scheduler

The Kubernetes scheduler stores the resource usage data for each compute node; determines whether a cluster is healthy; and determines whether new containers should be deployed, and if so, where they should be placed. The scheduler considers the health of the cluster generally alongside the pod’s resource demands, such as CPU or memory. Then it selects an appropriate compute node and schedules the task, pod, or service, taking resource limitations or guarantees, data locality, the quality of the service requirements, anti-affinity and affinity specifications, and other factors into account.

Kubernetes Controller Manager

There are various controllers in a Kubernetes ecosystem that drive the states of endpoints (pods and services), tokens and service accounts (namespaces), nodes, and replication (autoscaling). The controller manager—sometimes called cloud controller manager or simply controller—is a daemon which runs the Kubernetes cluster using several controller functions.

The controller watches the objects it manages in the cluster as it runs the Kubernetes core control loops. It observes them for their desired state and current state via the API server. If the current and desired states of the managed objects don’t match, the controller takes corrective steps to drive object status toward the desired state. The Kubernetes controller also performs core lifecycle functions.


etcd

Distributed and fault-tolerant, etcd is an open source, key-value store database that stores configuration data and information about the state of the cluster. etcd may be configured externally, although it is often part of the Kubernetes control plane.

etcd stores the cluster state based on the Raft consensus algorithm. This helps cope with a common problem that arises in the context of replicated state machines and involves multiple servers agreeing on values. Raft defines three different roles: leader, candidate, and follower, and achieves consensus by electing a leader.

In this way, etcd acts as the single source of truth (SSOT) for all Kubernetes cluster components, responding to queries from the control plane and retrieving various parameters of the state of the containers, nodes, and pods. etcd is also used to store configuration details such as ConfigMaps, subnets, and Secrets, along with cluster state data.

Kubernetes Cluster Architecture

Managed by the control plane, cluster nodes are machines that run containers. Each node runs an agent for communicating with the control plane, the kubelet—the primary Kubernetes controller. Each node also runs a container runtime engine, such as Docker or rkt. The node also runs additional components for monitoring, logging, service discovery, and optional extras.

Here are some Kubernetes cluster components in focus:


Nodes

A Kubernetes cluster must have at least one compute node, although it may have many, depending on the need for capacity. Pods are orchestrated and scheduled to run on nodes, so more nodes are needed to scale up cluster capacity.

Nodes do the work for a Kubernetes cluster. They connect applications and networking, compute, and storage resources.

Nodes may be cloud-native virtual machines (VMs) or bare metal servers in data centers.

Container Runtime Engine

Each compute node runs and manages container life cycles using a container runtime engine. Kubernetes supports Open Container Initiative-compliant runtimes such as Docker, CRI-O, and rkt.

Kubelet service

Each compute node includes a kubelet, an agent that communicates with the control plane to ensure the containers in a pod are running. When the control plane requires that a specific action happen on a node, the kubelet receives the pod specifications through the API server and executes the action. It then ensures the associated containers are healthy and running.

Kube-proxy service

Each compute node contains a network proxy called a kube-proxy that facilitates Kubernetes networking services. The kube-proxy either forwards traffic itself or relies on the packet filtering layer of the operating system to handle network communications both outside and inside the cluster.

The kube-proxy runs on each node to ensure that services are available to external parties and deal with individual host subnetting. It serves as a network proxy and service load balancer on its node, managing the network routing for UDP and TCP packets. In fact, the kube-proxy routes traffic for all service endpoints.


Pods

Until now, we have covered concepts that are internal and infrastructure-focused. In contrast, pods are central to Kubernetes because they are the key outward-facing construct that developers interact with.

A pod represents a single instance of an application and is the simplest unit within the Kubernetes object model. Each pod is composed of a container, or a group of tightly coupled containers that logically belong together, along with rules that control how the containers run.

Pods have a limited lifespan and eventually die, for example when they are replaced during an upgrade or removed when scaling back down. Although they are ephemeral, pods can run stateful applications by connecting to persistent storage.

Pods are also capable of horizontal autoscaling, meaning they can grow or shrink the number of instances running. They can also perform rolling updates and canary deployments.

The containers in a pod are always scheduled together on the same node, share storage, and can reach one another via localhost. An application may span multiple machines, so its pods may as well. One node can run multiple pods, each holding multiple containers.

The pod is the core unit of management in the Kubernetes ecosystem and acts as the logical boundary for containers that share resources and context. Differences in virtualization and containerization are mitigated by the pod grouping mechanism, which enables running multiple dependent processes together.

Achieve scaling in pods at runtime by creating replica sets, which deliver availability by constantly maintaining a predefined set of pods, ensuring that the deployment always runs the desired number. Services can expose a single pod or a replica set to external or internal consumers.

Services associate specific criteria with pods to enable their discovery. Pods and services are associated through key-value pairs called selectors and labels. Any new match between a pod label and selector will be discovered automatically by the service.
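The label/selector matching rule can be illustrated with a small sketch (the service and pod names below are hypothetical): a service selects exactly those pods whose labels contain every key-value pair in its selector, and any newly created pod that matches is picked up the same way.

```python
def selector_matches(selector, pod_labels):
    """A service selects a pod when every selector key/value pair
    appears among the pod's labels (extra pod labels are ignored)."""
    return all(pod_labels.get(k) == v for k, v in selector.items())

# Hypothetical service selector and pods.
service_selector = {"app": "cart", "tier": "backend"}

pods = [
    {"name": "cart-1", "labels": {"app": "cart", "tier": "backend"}},
    {"name": "cart-canary", "labels": {"app": "cart", "tier": "backend", "track": "canary"}},
    {"name": "web-1", "labels": {"app": "web", "tier": "frontend"}},
]

# The service's endpoints are the pods whose labels satisfy the selector.
endpoints = [p["name"] for p in pods if selector_matches(service_selector, p["labels"])]
```

Note that `cart-canary` still matches despite its extra `track` label; selectors require a superset match, not an exact one.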

Additional Kubernetes Web Application Architecture Components

Kubernetes manages an application’s containers, but it can also manage a cluster’s attached application data. Kubernetes users can request storage resources without knowing underlying storage infrastructure details.

A Kubernetes volume is just a directory that is accessible to a pod, which may hold data. The contents of the volume, how it comes to be, and the medium that backs it are determined by the volume type. Persistent volumes (PVs) are specific to a cluster, are generally provisioned by an administrator, and tie into an existing storage resource. PVs can therefore outlast a specific pod.

Kubernetes relies on container images that it stores in a container registry. It can be a third party registry or one an organization configures.

Namespaces are virtual clusters inside a physical cluster. They are intended to provide virtually separated work environments for multiple users and teams, and to prevent teams from hindering each other by limiting which Kubernetes objects they can access.
At the pod level, the containers within a pod share an IP address and network namespace and can reach one another's ports via localhost.

Kubernetes Architecture Best Practices

Kubernetes architecture is premised on availability, scalability, portability, and security. Its design is intended to distribute workloads across available resources more efficiently, optimizing the cost of infrastructure.

High Availability

Most container orchestration engines deliver application availability, but Kubernetes high availability architecture is designed to achieve availability of both applications and infrastructure.

Kubernetes architecture ensures high availability on the application front using replication controllers, replica sets, and stateful sets (formerly pet sets). Users can set the minimum number of pods that must run at any time. If a pod or container crashes, the declarative policy can return the deployment to the desired configuration. Configure stateful workloads for high availability using stateful sets.

Kubernetes HA architecture also supports infrastructure availability with a wide range of storage backends, from block storage devices such as Google Compute Engine persistent disk and Amazon Elastic Block Store (EBS), to distributed file systems such as GlusterFS and network file system (NFS), and specialized container storage plugins such as Flocker.

Moreover, each Kubernetes cluster component can be configured for high availability. Health checks and load balancers can further ensure availability for containerized applications.


Scalability

Applications deployed in Kubernetes are microservices, composed of many containers grouped into pods. Each container is logically designed to perform a single task.

Kubernetes 1.4 supports cluster auto-scaling, and Kubernetes on Google Cloud also supports auto-scaling. During auto-scaling, Kubernetes and the underlying infrastructure coordinate to add additional nodes to the cluster when no available nodes remain to scale pods across.


Portability

Kubernetes is designed to offer choice in cloud platforms, container runtimes, operating systems, processor architectures, and PaaS. For example, you can configure a Kubernetes cluster on various Linux distributions, including CoreOS, Red Hat Linux, CentOS, Fedora, Debian, and Ubuntu. It can be deployed to run locally, in a bare metal environment; and in virtualization environments based on vSphere, KVM, and libvirt. Serverless architecture for Kubernetes can run on cloud platforms such as Azure, AWS, and Google Cloud. It’s also possible to create hybrid cloud capabilities by mixing and matching clusters on-premises and across cloud providers.


Security

Kubernetes application architecture is configured securely at multiple levels. For a detailed look at Kubernetes Security, please see our discussion here.

Configuring Kubernetes Architecture Security

To secure Kubernetes clusters, nodes, and containers, there are several best practices based on DevOps practices and cloud-native principles to follow:

Update Kubernetes to the latest version. Only the latest three versions of Kubernetes are supported with security patches for newly identified vulnerabilities.

Configure the Kubernetes API server securely. Deactivate anonymous/unauthenticated access and use TLS encryption for connections between the API server and kubelets.

Secure etcd. Because etcd is the trusted source of cluster state, serve its client connections only over TLS.

Secure the kubelet. Deactivate anonymous access to the kubelet. Start the kubelet with the --anonymous-auth=false flag and limit what the kubelet can access with the NodeRestriction admission controller.

Embed security early in the container lifecycle. Ensure shared goals between DevOps and security teams.

Reduce operational risk using Kubernetes-native security controls. When possible, leverage native Kubernetes controls to enforce security policies so your own security controls and the orchestrator don’t collide.
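As an illustration of the API server and kubelet hardening steps above, the flags below sketch the relevant settings. Treat the paths and flag set as assumptions to verify against the documentation for your Kubernetes version, not as a complete secure configuration.

```shell
# API server: reject anonymous requests, serve only over TLS, and enable
# the NodeRestriction admission controller to limit what kubelets can modify.
kube-apiserver \
  --anonymous-auth=false \
  --tls-cert-file=/etc/kubernetes/pki/apiserver.crt \
  --tls-private-key-file=/etc/kubernetes/pki/apiserver.key \
  --enable-admission-plugins=NodeRestriction

# Kubelet: disable anonymous access and delegate authorization decisions
# to the API server instead of allowing all requests.
kubelet \
  --anonymous-auth=false \
  --authorization-mode=Webhook \
  --client-ca-file=/etc/kubernetes/pki/ca.crt
```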

Does VMware NSX Advanced Load Balancer Offer Services for Kubernetes Container Architecture?

A modern, distributed application services platform is the only option for delivering an ingress gateway for applications based on Kubernetes microservices architecture. For web-scale, cloud-native applications deployed using container technology as microservices, traditional appliance-based ADC solutions are not up to the task of managing Kubernetes container clusters. Each can have hundreds of pods with thousands of containers, mandating policy driven deployments, full automation, and elastic container services.

The VMware NSX Advanced Load Balancer’s Kubernetes ingress services provide enterprise-grade application services including ingress controller, LB, WAF, and GSLB for distributed apps (both traditional & cloud-native) beyond containers to VMs and bare metal. The VMware NSX Advanced Load Balancer helps simplify operations for production ready clusters across multi-cloud, multi-region, and multi-infra environments. Learn how to deploy and automate here.

For more on the actual implementation of load balancing, security applications and web application firewalls check out our Application Delivery How-To Videos.

Keyed Hash Message Authentication Code (HMAC) Definition

In cryptography, a keyed hash message authentication code (HMAC) is a specific type of message authentication code (MAC) involving a cryptographic hash function (hence the ‘H’) in combination with a secret cryptographic key. As with any MAC, it may be used to simultaneously verify both the data integrity and the authenticity of a message. Any cryptographic hash function, such as MD5 or SHA-1, may be used in the calculation of an HMAC; the resulting MAC algorithm is termed HMAC-MD5 or HMAC-SHA1 accordingly. The cryptographic strength of the HMAC depends upon the cryptographic strength of the underlying hash function, the size of its hash output, and the size and quality of the key.
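Using Python's standard library, the tag-and-verify flow looks like the sketch below. It uses HMAC-SHA256 rather than the MD5 or SHA-1 variants mentioned above (those underlying hashes are now considered weak), and the key and message are of course placeholders.

```python
import hashlib
import hmac

key = b"shared-secret-key"            # agreed out of band by both parties
message = b"transfer 100 to alice"

# Sender computes the authentication tag over the message with the shared key.
tag = hmac.new(key, message, hashlib.sha256).hexdigest()

# Receiver recomputes the tag and compares it in constant time.
expected = hmac.new(key, message, hashlib.sha256).hexdigest()
authentic = hmac.compare_digest(tag, expected)   # True: intact and authentic

# A different key (or a tampered message) yields a different tag,
# so a forger without the key cannot produce a valid one.
forged = hmac.new(b"wrong-key", message, hashlib.sha256).hexdigest()
```

The constant-time `hmac.compare_digest` comparison matters in practice: a naive `==` check can leak timing information to an attacker.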