Application Visibility Definition

Application visibility is the ability to gain insights into application performance, health, security, and other such aspects of an application. These insights enable application developers and owners to continuously optimize app performance to match usage patterns.

How does VMware NSX Advanced Load Balancer provide Application Visibility?

VMware NSX Advanced Load Balancer leverages its strategic placement as a load balancer in the application traffic path. The Service Engines are distributed load balancers that can be deployed close to the applications to enable optimal performance. These Service Engines collect real-time telemetry on application performance, usage, health, and security posture. The metrics are then sent to the centralized Controller, which analyzes and processes them to provide visual dashboards to application owners.

Application Performance Definition

Application performance indicates how an application is functioning and how responsive it is to end-users. Tools are available to measure application performance and enable app developers to detect and diagnose complex performance problems in order to maintain the expected level of service to end-users.

How does VMware NSX Advanced Load Balancer help deliver rich Application Performance?

VMware NSX Advanced Load Balancer leverages its strategic position in the network. Being in the application traffic path, the Service Engines (the distributed load balancers) can collect real-time application telemetry. These insights give administrators intelligence into application performance and enable them to provide a rich end-user experience.

Auto Scaling


Auto Scaling Definition

Auto scaling, also referred to as autoscaling, auto-scaling, and sometimes automatic scaling, is a cloud computing technique for dynamically allocating computational resources. Depending on the load on a server farm or pool, the number of active servers typically varies automatically as user needs fluctuate.

Auto scaling and load balancing are related because an application typically scales based on load balancing serving capacity. In other words, the serving capacity of the load balancer is one of several metrics (including cloud monitoring metrics and CPU utilization) that shapes the auto scaling policy.

[Figure: Auto scaling, a cloud computing technique for dynamically allocating computational resources.]
Autoscaling FAQs

What is Auto Scaling in Cloud Computing?

Autoscaling is a cloud computing feature that enables organizations to scale cloud services such as server capacities or virtual machines up or down automatically, based on defined situations such as traffic or utilization levels. Cloud computing providers, such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP), offer autoscaling tools.

Core autoscaling features also enable lower-cost, reliable performance by seamlessly adding and removing instances as demand spikes and drops. As such, autoscaling provides consistency despite the dynamic and, at times, unpredictable demand for applications.

The overall benefit of autoscaling is that it eliminates the need to respond manually in real time to traffic spikes that merit new resources and instances, by automatically changing the number of active servers. Automating the configuration, monitoring, and decommissioning of each of these servers is the core of autoscaling.

For instance, when such a spike is driven by a distributed denial of service (DDoS) attack, it can be difficult to recognize. More efficient monitoring of autoscaling metrics and better autoscaling policies can sometimes help a system respond quickly to this issue. Similarly, an auto scaling database automatically scales capacity up or down, starts up, or shuts down based on the needs of an application.

Important Auto Scaling Terminology

An instance is a single server or machine that is subject to auto scaling rules created for a group of machines. The group itself is an auto scaling group, with each instance in the group subject to those auto scaling policies.

For example, the Elastic Compute Cloud (EC2) is the compute platform of the AWS ecosystem. EC2 instances offer scalable, customizable server options within the AWS cloud. Amazon EC2 instances are virtual, elastically scaled on demand, and seamless to the end user.

An auto scaling group is a logical collection of Amazon EC2 instances for automatic scaling purposes. Each Amazon EC2 instance in the group will be subject to the same auto scaling policies.

Auto scaling group size refers to the number of instances in the auto scaling group. Desired capacity or size is the ideal number of instances in that auto scaling group. If there is a difference between those two numbers, the auto scaling group can either instantiate (provision and attach) new instances or remove (detach and terminate) instances.

Depending on the rules and auto scaling algorithms controlling a particular auto scaling group, minimum and maximum size thresholds set cutoff points above which instance capacity should not rise and below which it should not fall. An auto scaling policy also often specifies changes to the auto scaling group’s desired capacity in response to metrics breaching set thresholds.
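
To make these terms concrete, here is a minimal Python sketch of a threshold-based policy. All names (AutoScalingGroup, cpu_utilization, the 0.80/0.30 thresholds) are illustrative assumptions, not any provider's API:

```python
from dataclasses import dataclass

@dataclass
class AutoScalingGroup:
    min_size: int            # capacity may never fall below this
    max_size: int            # capacity may never rise above this
    desired_capacity: int    # ideal number of instances right now

def evaluate_policy(group: AutoScalingGroup, cpu_utilization: float) -> int:
    """Adjust desired capacity when a metric breaches a threshold,
    then clamp the result to the group's min/max size."""
    desired = group.desired_capacity
    if cpu_utilization > 0.80:      # scale-out threshold breached
        desired += 1
    elif cpu_utilization < 0.30:    # scale-in threshold breached
        desired -= 1
    return max(group.min_size, min(group.max_size, desired))

group = AutoScalingGroup(min_size=2, max_size=10, desired_capacity=4)
group.desired_capacity = evaluate_policy(group, cpu_utilization=0.92)  # now 5
```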

Auto scaling policies often have associated cooldown periods to ensure the entire system continues to manage traffic. A cooldown period adds a delay after each scaling action so that newly instantiated instances have time to begin handling traffic before further actions are taken.
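
As a rough illustration (the class below is a hypothetical sketch, not a real provider API), a cooldown can be modeled as a gate that suppresses further scaling actions until a fixed period after the last one:

```python
import time

class CooldownGate:
    """Suppresses scaling actions for a fixed period after the last one."""

    def __init__(self, cooldown_seconds: float):
        self.cooldown_seconds = cooldown_seconds
        self.last_action_time = float("-inf")  # no action taken yet

    def ready(self) -> bool:
        return time.monotonic() - self.last_action_time >= self.cooldown_seconds

    def record_action(self) -> None:
        self.last_action_time = time.monotonic()

gate = CooldownGate(cooldown_seconds=300)  # e.g., a five-minute cooldown
if gate.ready():
    # ...apply the scaling policy here, then start the cooldown...
    gate.record_action()
```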

Changes to an auto scaling group’s desired capacity might be fixed or incremental. Fixed changes simply provide a desired capacity value. Incremental changes decrease or increase by a specific number rather than setting an end value. Scaling up policies, also called scaling out policies, increase desired capacity. Scaling down policies, also called scaling in policies, decrease desired capacity.

To determine whether attached instances are functioning properly, an auto scaling group conducts a health check. Unhealthy instances should be marked for replacement.

Elastic load balancing software can conduct health checks. Amazon EC2 status checks are also an option, as are custom health checks. A positive health check can be based on whether the instance remains registered and in service with its associated load balancer, or whether the instance still exists and is reachable.
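
A minimal sketch of the replacement loop, assuming hypothetical is_healthy and launch_instance callables standing in for a real health check and launch mechanism:

```python
def reconcile_health(instances, is_healthy, launch_instance):
    """Keep group size stable by replacing instances that fail a health check.

    `is_healthy` might wrap a load balancer health check, a provider
    status check, or a custom probe; `launch_instance` provisions and
    attaches a fresh instance. Both are illustrative stand-ins.
    """
    healthy = [i for i in instances if is_healthy(i)]
    unhealthy_count = len(instances) - len(healthy)
    replacements = [launch_instance() for _ in range(unhealthy_count)]
    return healthy + replacements
```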

Launch configuration describes the scripts and parameters needed to launch a new instance. This includes the instance type, machine image, potential availability zones for launch, purchase options (such as spot vs. on-demand), and scripts to run upon launch.
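
Expressed as plain data, a launch configuration might look like the following hypothetical example; the field names and values are illustrative placeholders, as each provider defines its own schema:

```python
launch_configuration = {
    "instance_type": "m5.large",                # size/shape of each new instance
    "machine_image": "ami-example-web-server",  # base image to boot from (placeholder ID)
    "availability_zones": ["us-east-1a", "us-east-1b"],
    "purchase_option": "on-demand",             # alternative: "spot"
    "launch_script": "#!/bin/bash\nsystemctl start app.service\n",
}
```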

What is the Difference Between Auto Scaling and Load Balancing?

Elastic load balancing and application autoscaling are closely related. Both reduce backend tasks such as monitoring the health of servers, managing the traffic load among the servers, and adding or removing servers as requirements change. In fact, it is common to see solutions that include a load balancer with autoscaling features. However, elastic load balancing and auto scaling are distinct concepts.

An application load balancer and an auto scaling group work in tandem as follows. You can attach a load balancer to an auto scaling group to improve availability and performance and decrease application latency. This works because you define your autoscaling policies based on your application requirements to scale instances in and out, which in turn determines how the load balancer distributes the traffic load among the running instances.

Autoscaling allows a user to set a policy based on predefined criteria that manage the number of available instances in both peak and off-peak hours. This enables multiple instances with the same functionality—parallel capabilities increasing or decreasing depending on demand.

In contrast, an elastic load balancer simply checks the health of each instance, distributes traffic, and connects each request to appropriate target groups. If it detects an unhealthy instance, an elastic load balancer stops traffic to that instance and sends data requests elsewhere. It also prevents any one instance from being swamped by requests.

Autoscaling with elastic load balancing works by attaching a load balancer to an autoscaling group so that requests are routed evenly across all instances. This frees the user from monitoring the number of endpoints the instances create, another difference between autoscaling and load balancing in terms of how each works alone.
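
The interplay can be sketched as follows: a simple round-robin balancer spreads requests across whatever instances currently exist, while the auto scaler registers and deregisters instances as it launches and terminates them. All names here are illustrative assumptions:

```python
class RoundRobinBalancer:
    """Distributes requests evenly across the currently registered instances."""

    def __init__(self):
        self.instances = []
        self._next = 0

    def register(self, instance):      # called by the auto scaler on scale-out
        self.instances.append(instance)

    def deregister(self, instance):    # called on scale-in or a failed health check
        self.instances.remove(instance)

    def route(self, request):
        if not self.instances:
            raise RuntimeError("no instances registered")
        instance = self.instances[self._next % len(self.instances)]
        self._next += 1
        return instance.handle(request)
```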

Advantages of Auto Scaling

Auto scaling provides several advantages:

Cost. When loads are low, auto scaling allows both companies managing their own infrastructure and businesses that rely on cloud infrastructure to send some servers to sleep. This reduces electricity costs and water costs where water is used in cooling. Cloud auto scaling also means paying for total usage instead of maximum capacity.

Security. Auto scaling also protects against application, hardware, and network failures by detecting unhealthy instances and replacing them, preserving application resiliency and availability.

Availability. Auto scaling improves availability and uptime, especially when production workloads are less predictable.

While many businesses have a set daily, weekly, or yearly cycle to govern server use, auto scaling is different in that it reduces the chance of having too many or too few servers for the actual traffic load. This is because auto scaling is responsive to actual usage patterns, in contrast to a static scaling solution.

For example, a static scaling solution might rely on the idea that traffic is typically lower at 2:00 am, and send some servers to sleep at that time. However, in practice there may be spikes at that time—perhaps during a viral news event or other unexpected times.

Predictive Autoscaling vs Scheduled Autoscaling

By default, autoscaling is a reactive approach to decision making: it scales capacity in real time in response to changes in traffic metrics. However, in certain situations, especially when changes happen very quickly, a reactive approach may be less effective.

Scheduled autoscaling is a hybrid approach to scaling policy that still functions in real time but also anticipates known changes in traffic loads and executes policy reactions to those changes at specific times. Scheduled scaling works best in cases where there are known traffic decreases or increases at particular times of day, but where the changes in question are typically very sudden. Unlike static scaling solutions, scheduled scaling keeps autoscaling groups “on notice” to respond quickly during key times with added capacity.

Predictive autoscaling deploys predictive analytics, including historical usage data and recent usage trends, to autoscale based on predictions about usage in the future. Predictive autoscaling is especially useful for:

  • Detecting large, imminent spikes in demand and readying capacity slightly in advance
  • Coping with large-scale, regional outages
  • Offering more flexibility in scaling out or in to respond to variable traffic patterns throughout the day
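
A minimal sketch contrasting the scheduled and predictive approaches, assuming a hypothetical forecast value produced by a model trained on historical usage (the peak window, headroom factor, and per-instance capacity are all illustrative):

```python
import math
from datetime import datetime, time

def scheduled_capacity(now: datetime, baseline: int) -> int:
    """Scheduled: pre-provision for a known daily peak, e.g. 9:00-17:00."""
    if time(9, 0) <= now.time() <= time(17, 0):
        return baseline * 2
    return baseline

def predictive_capacity(forecast_load: float, per_instance_capacity: float) -> int:
    """Predictive: size the group slightly ahead of the forecast demand."""
    headroom = 1.2  # keep 20% spare capacity to absorb forecast error
    return math.ceil(forecast_load * headroom / per_instance_capacity)

print(scheduled_capacity(datetime(2024, 1, 8, 10, 30), baseline=4))            # 8
print(predictive_capacity(forecast_load=950.0, per_instance_capacity=100.0))   # 12
```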

Horizontal Auto Scaling vs Vertical Auto Scaling

Horizontal auto scaling refers to adding more servers or machines to the auto scaling group in order to scale. Vertical auto scaling means scaling by adding more power rather than more units, for example in the form of additional RAM.
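
The difference can be shown with two illustrative operations on a hypothetical server pool (a sketch, not any real provider's model):

```python
from dataclasses import dataclass, field

@dataclass
class Server:
    cpus: int
    ram_gb: int

@dataclass
class Pool:
    servers: list = field(default_factory=list)

def scale_horizontally(pool: Pool, count: int, template: Server) -> None:
    """Horizontal: add more identical machines to the pool."""
    pool.servers.extend(Server(template.cpus, template.ram_gb) for _ in range(count))

def scale_vertically(server: Server, extra_cpus: int, extra_ram_gb: int) -> None:
    """Vertical: add power to one existing machine (often requires downtime)."""
    server.cpus += extra_cpus
    server.ram_gb += extra_ram_gb
```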

There are several issues to weigh when comparing horizontal and vertical auto scaling.

Because it involves adding more power to an existing machine, there are inherent architectural challenges to vertical auto scaling. There is no redundant server, and application health is tied to the machine’s single location. Vertical scaling also demands downtime for reconfigurations and upgrades. Finally, vertical auto scaling enhances performance, but not availability.

Decoupling application tiers may mitigate part of the vertical scaling challenge, because application tiers are likely to use resources and grow at different rates. Stateless servers are best able to handle requests and make it easy to add more instances to each tier, improving the user experience. This also allows you to use elastic load balancing to spread incoming requests across instances more efficiently.

When there are thousands of concurrent users, vertical scaling alone cannot handle the request volume. In these cases, horizontal auto scaling adds more machines to the resource pool. Load balancing, distributed file systems, and clustering are all part of effective horizontal auto scaling.

Stateless servers are important for applications that typically have many users. User sessions should ideally never be tied to one server, and should instead be able to move seamlessly across many servers while maintaining a single session. Storing session state outside any single server, for example in the browser, allows for a better user experience and is one of the results of good horizontal scaling.

A service-oriented architecture for applications should include logical blocks that interact yet are self-contained. This way, you can scale blocks out individually based on need. Tiers for applications, caching, database, and web should all be independent pieces of a microservice architecture to reduce both vertical and horizontal scaling costs.

Horizontal auto scaling does not demand downtime, in that it creates independent new instances. It also increases availability as well as performance due to this independence.

Remember: vertical scaling does not work for every organization or workload. Large numbers of users create demand for horizontal scaling, and a single large instance versus many smaller instances will perform differently on the same total resources depending on user needs.

For emergencies and unplanned events, the multiple instances created by horizontal auto scaling may increase resiliency. For this reason, it is often the preferred approach for cloud-based businesses.

[Figure: VMware elastic autoscaling can scale vertically with more CPUs or horizontally with more Service Engines.]

Does VMware NSX Advanced Load Balancer offer Auto Scaling Solutions?

The VMware NSX Advanced Load Balancer delivers predictive auto scaling and elastic load balancing. The Controller can scale both application instances and load balancers based on application usage and traffic patterns.

Legacy, appliance-based application delivery controllers (ADCs) can become bottlenecks because their capacity is static and cannot scale dynamically. The VMware NSX Advanced Load Balancer resolves this issue by scaling resources dynamically to match traffic patterns.

Learn how to enforce auto scale thresholds for vertical or horizontal resource scaling in real time.

For more on the actual implementation of load balancing, security applications and web application firewalls, check out our Application Delivery How-To Videos.

Application Intelligence

Application intelligence is the ability to acquire and apply knowledge from insights such as application performance monitoring, end-to-end timing, application health score, and real-time traffic logs.

How does VMware NSX Advanced Load Balancer provide Application Intelligence?

The software-defined architecture of the VMware NSX Advanced Load Balancer enables Service Engines, the distributed load balancers, to be deployed close to applications. These Service Engines collect real-time telemetry for parameters such as concurrent open connections, round-trip time (RTT), throughput, errors, response latency, SSL handshake latency, and response types, providing insight and intelligence into application performance, user experience, and network traffic. The Controller presents this Application Intelligence in a visual, actionable dashboard for administrators.

Application Delivery


Application Delivery Definition

Application delivery is a mechanism to deliver application functionality quickly and efficiently to users. Traditionally, hardware application delivery controllers (ADCs) were used to deliver applications. But cloud-native application architectures (such as microservices) require a new application delivery solution: the software ADC. These solutions provide application delivery optimization by allowing enterprises to create a highly scalable application delivery model that makes application services available on demand, automating the deployment of new ADCs as they are needed.

[Figure: Application delivery used by enterprises and small businesses to deliver software applications to end users from their networks on bare metal, virtualized, container, and public cloud environments.]
FAQs

What Is Application Delivery?

Application delivery refers to the pool of services that combine to provide application functionality, usually web-based software applications, from the data centers or cloud environments where the required data processing and computing are executed, to the application clients or end-users over the internet. The services for delivering applications on a network infrastructure aim to provide a reliable user experience through load balancing, security, and latency and TCP optimizations, which combine to deliver application content seamlessly. Business IT teams’ role in application delivery focuses on how applications are architected and managed within the data center and cloud hosting services.

What Is An Application Delivery Network?

An application delivery network (ADN) provides application availability, security, visibility and acceleration. The technologies are deployed together in a combination of WAN optimization controllers (WOCs) and application delivery controllers (ADCs). The application delivery controller distributes traffic among many servers. The WAN optimization controller uses caching and compression to reduce the number of bits that flow over a network. ADNs typically assist in the acceleration of content delivery, especially immediate and dynamic content such as online gaming and trading.

What Is Application Delivery Management?

Application delivery management is the discipline of achieving fast, predictable and secure access to applications. Application delivery management provides delivery solutions by ensuring vital enterprise applications are available and responsive for users. This requires application delivery optimization for throughput, connections per second, security, troubleshooting, and analytics.

What Are the Benefits of Modern Application Delivery Systems?

A cloud-native application delivery system offers the following IT benefits:

• Simplified infrastructure: Replaces hardware-based application servers with a public cloud service that is better equipped to scale globally without compromising delivery quality.

• Reduced costs: Companies spend less on customer support when the user experience improves with application performance. A cloud-native application delivery process also saves on hardware acquisition and maintenance costs.

• Increased productivity: Efficiency is optimized when employees can quickly access the information and services on applications from any device, anywhere. A cloud-native application delivery process makes it possible for applications to perform faster.

• Improved end-user experience: Customers will increasingly use and prefer high performance applications made possible by an efficient cloud-based application delivery process.

What Does An Application Delivery Manager Do?

An Application Delivery Manager is responsible for the availability and responsiveness of an organization’s applications. They can use agile project management techniques to deliver products under an efficient application delivery model. The application delivery manager leads the planning process and prioritizes resources based on team capacity to create application delivery solutions.

Does VMware NSX Advanced Load Balancer offer Application Delivery?

Yes. The VMware NSX Advanced Load Balancer delivers Intent-Based Application Services by automating intelligence and elasticity across any cloud. Software-defined application services from the VMware NSX Advanced Load Balancer provide a flexible solution that goes beyond load balancing to deliver pinpoint analytics, predictive autoscaling, multi-cloud traffic management and a cloud security solution.

For more on the actual implementation of load balancing, security applications and web application firewalls, check out our Application Delivery How-To Videos.

For more information on application delivery, see the following resources: