Capacity Health Check & Analysis
CiRBA allows the health of a running environment to be assessed in several ways, including the detection of unusual or undesirable workload patterns and the identification of improperly configured resources. The efficiency of an environment is assessed through the detailed analysis of “whitespace,” which allows capacity allocations to be decomposed into spare capacity that is truly required and spare capacity that is simply waste. CiRBA’s advanced workload modeling and constraint analysis are applied to continuously track capacity health and status within an environment. Some of the key areas where CiRBA’s analytics assist organizations in the ongoing analysis of capacity status are outlined below:
Identifying Operational Risks and Possible Service Impact
CiRBA’s advanced utilization analysis can automatically detect unusual, anomalous, or otherwise suspect workload patterns to identify and predict operational risks and potential business service impacts. Some of the problematic patterns detectable by CiRBA include the following:
- Spinning processes, which often go undetected and can cause workload managers and load balancers to make incorrect decisions
- High context switching, which may go undetected by management tools that only look at CPU and memory, and which can cause significant operational problems in virtual environments, particularly if several such workloads are placed onto the same physical host
- Workloads with “glass ceilings,” or unusually low peak utilization levels, which may be indicative of an under-allocation of resources (causing throttling of activity) or of single threaded applications that cannot leverage all the resources at their disposal
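As an illustration of the last pattern, a "glass ceiling" can be approximated as a workload whose utilization repeatedly flattens out at a level well below 100%. The following is a minimal sketch under stated assumptions, not CiRBA's actual detection logic: the function name, the thresholds, and the use of simple CPU percentage samples are all hypothetical.

```python
def has_glass_ceiling(samples, ceiling_max=60.0, tolerance=2.0, min_fraction=0.2):
    """Flag a workload whose utilization plateaus well below 100%.

    samples      -- CPU utilization percentages (0-100) over the observation window
    ceiling_max  -- assumed cutoff: peaks above this are not considered "low"
    tolerance    -- how close (in percentage points) a sample must be to the peak
    min_fraction -- assumed share of samples that must sit at the peak to call it a plateau
    """
    if not samples:
        return False
    peak = max(samples)
    if peak > ceiling_max:
        # The workload reaches high utilization, so nothing is capping it.
        return False
    # Count the samples pinned at (or very near) the observed peak.
    near_peak = sum(1 for s in samples if peak - s <= tolerance)
    return near_peak / len(samples) >= min_fraction


# A single-threaded application on a 4-vCPU VM tends to flatten out near 25%.
capped = [10, 25, 25, 24.5, 25, 12, 25, 25, 8, 25]
bursty = [5, 10, 95, 7, 6, 8, 5, 9, 6, 7]
print(has_glass_ceiling(capped), has_glass_ceiling(bursty))
```

A workload that is merely quiet, with a low peak that it touches only once, is not flagged, because no plateau forms at the peak.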
Identifying Waste and VM Sprawl
Sprawl in virtual environments is a very real problem faced by many organizations. CiRBA enables the tracking of VMs, their configurations, and their utilization in order to help prevent sprawl and waste, including:
- Detecting idle systems by identifying a combination of low utilization and a lack of active connections with other systems
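The idle-system test described above combines two signals: low utilization and no active connections. A minimal sketch of that rule might look like the following; the `VmActivity` record, its field names, and the 5% threshold are assumptions made for illustration and do not reflect CiRBA's data model.

```python
from dataclasses import dataclass


@dataclass
class VmActivity:
    """Hypothetical per-VM summary over an observation window."""
    name: str
    peak_cpu_pct: float      # highest CPU utilization seen in the window
    active_connections: int  # connections to or from other systems


def find_idle_vms(vms, cpu_threshold_pct=5.0):
    """Return names of VMs with low utilization AND no active connections."""
    return [
        vm.name
        for vm in vms
        if vm.peak_cpu_pct < cpu_threshold_pct and vm.active_connections == 0
    ]


inventory = [
    VmActivity("web-01", peak_cpu_pct=42.0, active_connections=17),
    VmActivity("old-batch", peak_cpu_pct=1.5, active_connections=0),
    VmActivity("quiet-db", peak_cpu_pct=2.0, active_connections=3),
]
print(find_idle_vms(inventory))
```

Requiring both conditions avoids flagging a lightly loaded system that other systems still depend on, such as `quiet-db` in this example.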
Identifying Improperly Configured Resources
Although improperly configured resources can sometimes be detected through unusual operational patterns, they may also be latent and difficult to detect. A good example is a virtual machine that is allocated more memory than it requires. Significant resources can be consumed by such erroneous configurations. CiRBA rules can be leveraged to detect any problematic configuration that may cause performance issues. Some areas where CiRBA can be applied include the following:
- Analyzing actual resource utilization versus allocations, to uncover over/under configured VMs based on historical usage patterns throughout the operational cycle
- Identifying vSMP risks caused by over/under allocation of virtual CPUs to a specific virtual machine
- Reconciling resource pool settings against the aggregate utilization of the resources within the pool and, if application-level metrics are available, against the performance of the applications being hosted
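The first item in this list, comparing actual utilization against allocations, can be sketched as a simple ratio check. This is an illustrative example only: the function name and the 40% and 90% thresholds are assumptions, and a real analysis would consider usage patterns across the full operational cycle rather than a single peak value.

```python
def classify_allocation(allocated, peak_used, low=0.4, high=0.9):
    """Classify a VM resource allocation from its historical peak usage.

    allocated -- amount of the resource assigned to the VM (e.g. GB of memory)
    peak_used -- highest amount actually used over the observation window
    low, high -- assumed thresholds on the used/allocated ratio
    """
    if allocated <= 0:
        raise ValueError("allocated must be positive")
    ratio = peak_used / allocated
    if ratio < low:
        return "over-allocated"
    if ratio > high:
        return "under-allocated"
    return "right-sized"


print(classify_allocation(allocated=16, peak_used=4))    # 16 GB assigned, 4 GB peak
print(classify_allocation(allocated=8, peak_used=7.8))   # running close to the limit
print(classify_allocation(allocated=8, peak_used=5))
```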
Identifying True Capacity Whitespace vs. Waste
Whitespace, or spare capacity, can be divided into two parts: the extra capacity required to meet the varying needs of the workloads (sometimes referred to as “encumbered” or “reserved” whitespace), and waste (excess whitespace). The challenge is to identify the proper amount of reserved whitespace required for each environment. CiRBA enables organizations to determine and track whitespace as a key efficiency metric, and to continuously eliminate excess whitespace through the intelligent placement of workloads or the rebalancing and reallocation of physical hardware.
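The split between reserved whitespace and waste can be shown with simple arithmetic. The sketch below assumes the required headroom is already known as a single number; in practice, determining that figure is the hard part and depends on the workload analysis described above. The function and parameter names are hypothetical.

```python
def split_whitespace(capacity, used, required_headroom):
    """Split spare capacity into reserved whitespace and waste.

    capacity          -- total capacity of the environment (any consistent unit)
    used              -- capacity currently consumed by workloads
    required_headroom -- assumed spare capacity the workloads genuinely need
    Returns a (reserved, waste) tuple.
    """
    if used > capacity:
        raise ValueError("used cannot exceed capacity")
    whitespace = capacity - used
    # Reserved whitespace cannot exceed the whitespace that actually exists.
    reserved = min(required_headroom, whitespace)
    waste = whitespace - reserved
    return reserved, waste


# 100 units of capacity, 30 in use, 25 needed as headroom for peaks:
reserved, waste = split_whitespace(capacity=100, used=30, required_headroom=25)
print(f"reserved={reserved}, waste={waste}")
```

When the required headroom exceeds the available whitespace, waste is zero and the shortfall signals a capacity risk rather than an efficiency problem.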

Fully Loaded Utilization
The ultimate measure of infrastructure efficiency is referred to as fully loaded utilization. It is the ratio of the current utilization of capacity to the theoretical maximum utilization, given all the requirements, obligations, and constraints described previously. Alternatively, it can be defined as the ratio of the minimum amount of server capacity that will safely support a business service to the amount of capacity currently in use. Efficiency is a moving target, and what is acceptable today may change tomorrow.
Another important aspect of this measure is that it takes into account the unique operational requirements of each environment. Many organizations attempt to use average utilization as an efficiency measure, but this can be misleading.
For example, a set of servers that support a trading application may run at 100% from 9:30 to 9:35 a.m. but run far lower for the rest of the day, netting out to an average of 5% utilization. Unfortunately, in this environment, 5% is as good as it gets. In this case, the fully loaded utilization figure would take these peak demands into account and have a value close to 100%, reflecting the fact that the environment is being run as efficiently as possible under the circumstances.
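The contrast between the two measures can be made concrete with a small calculation. The sketch below is a deliberate simplification: it assumes peak demand is the only constraint, so the minimum capacity that safely supports the service equals the observed peak plus an optional headroom. The function names and the sample data are hypothetical, and a real fully loaded utilization figure would account for all of the applicable requirements and constraints.

```python
def average_utilization(samples_pct):
    """Plain average of utilization samples, in percent."""
    return sum(samples_pct) / len(samples_pct)


def fully_loaded_utilization(samples_pct, headroom_pct=0.0):
    """Minimum capacity required as a percentage of capacity deployed.

    Simplifying assumption: the capacity required is the observed peak
    demand plus optional headroom, capped at 100%.
    """
    required = max(samples_pct) + headroom_pct
    return min(required, 100.0)


# One day of 5-minute samples: a single interval at 100%, the rest near idle.
day = [100.0] + [4.0] * 287
print(f"average:      {average_utilization(day):.1f}%")
print(f"fully loaded: {fully_loaded_utilization(day):.1f}%")
```

Under these assumptions the average stays in the low single digits while the fully loaded figure reaches 100%, which matches the trading example: no capacity can be removed without putting the peak at risk.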