From: www.itworld.com

12 Tips for Successful VMware-Based Virtualization

by Andrew Hillier

November 28, 2007

VMware's virtualization solutions promise greater hardware utilization and
flexibility. Yet, like all virtualization technologies, they carry a certain
amount of risk. Stack too many servers and you achieve significant financial
gain but incur the risk of operational incidents and performance problems in
your production environment. Ignore the business-related aspects of your production
environment that aren't present in the lab and you may put critical apps at
great risk. Rely too heavily on some of the advanced automation features of
VMware without the proper planning and you could wind up with bigger problems
than those you were trying to solve in the first place.

Proper planning is at the heart of any technology optimization initiative, and
this applies to VMware as well. Large-scale virtualization necessitates a data-driven
approach that carefully evaluates elements such as business considerations, technical
constraints, and workload patterns. Things in the VMware world are very fluid,
so it's important not only to achieve an optimal initial placement of virtual
machines, but also to understand how to keep the environment optimized.

Virtualization planning can be very complex without the proper planning
tools, but regardless of the approach, organizations should ensure that they
follow some basic guidelines during the process.

Watch for Technical Factors that May Introduce Risk: Be careful when
combining servers that have differing configurations, diverse underlying platforms,
or varying network/storage connectivity. Combining servers that touch too many
networks onto a single physical host can drive up costs by increasing the number
of NICs and PCI extenders required (blade racks are particularly sensitive to this).
Be sure to uncover any hardware or configurations of interest, such as SAN controllers,
token ring cards, IVRs, proprietary daughterboards, direct-connect printers,
or other items that are not part of the standard build. This process, called
variance analysis, reveals hardware configuration affinities and "outliers",
which ultimately helps avoid any interruption of critical business services
during the virtualization process.
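
As a rough illustration of variance analysis, the sketch below flags any
configuration value that only a small fraction of servers share. The inventory,
the attribute names and the 20% threshold are all hypothetical; a real
inventory would come from a discovery or asset-management tool.

```python
from collections import Counter

def find_outliers(inventory, threshold=0.2):
    """Flag attribute values held by fewer than `threshold` of all servers."""
    outliers = {}
    total = len(inventory)
    attributes = {key for config in inventory.values() for key in config}
    for attr in sorted(attributes):
        # How many servers share each value of this attribute?
        counts = Counter(config.get(attr) for config in inventory.values())
        for server, config in sorted(inventory.items()):
            value = config.get(attr)
            if counts[value] / total < threshold:
                outliers.setdefault(server, []).append((attr, value))
    return outliers

# Hypothetical inventory; names and values are made up for illustration.
standard = {"os": "Windows 2003", "nic_count": 2, "special_hw": None}
inventory = {name: dict(standard) for name in
             ("app01", "app02", "app03", "app04", "app05")}
inventory["ivr01"] = {"os": "Windows 2003", "nic_count": 6,
                      "special_hw": "telephony card"}

outliers = find_outliers(inventory)
for server, differences in outliers.items():
    print(server, differences)
```

Here only ivr01 is reported, because its NIC count and its extra card are each
found on one server in six.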

Consider the Key Business Constraints that Govern the Environment: Consider
real-world business constraints, such as availability targets, maintenance windows,
application owners, compliance restrictions, disaster recovery strategies, and
other business sensitivities. Most small-scale virtualization planning doesn't
go beyond simple workload analysis, yet any foray into larger production environments
will show that it is very important to dig much deeper. For example, it's not
unheard of to combine virtualization candidates based solely on utilization
data and end up with a dysfunctional environment where there is not a single
time in the calendar when the physical server can actually be shut down for
maintenance. Considering the maintenance windows of the applications in the
planning phase avoids such problems; it is not always wise to rely on
VMotion to get out of a jam. Likewise, mixing different availability levels
can either create risk or waste expensive hardware.
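
The maintenance-window problem can be checked mechanically. The sketch below
assumes each application's allowed downtime is recorded as a set of
(day, hour) slots, which is a made-up representation, and intersects those
sets for every guest proposed for one host. An empty result means the host
could never be shut down.

```python
def common_maintenance_slots(windows):
    """Return the weekly slots in which every guest may be taken down."""
    slot_sets = [set(slots) for slots in windows.values()]
    return set.intersection(*slot_sets) if slot_sets else set()

# Hypothetical windows for guests proposed for one physical host.
windows = {
    "payroll":  {("Sat", 22), ("Sat", 23), ("Sun", 2)},
    "intranet": {("Sat", 23), ("Sun", 2), ("Sun", 3)},
    "trading":  {("Sun", 2), ("Sun", 3)},
}
shared = common_maintenance_slots(windows)
print(shared)  # {('Sun', 2)}

# Adding a guest that can only go down mid-week leaves no shared window.
windows_bad = dict(windows, callcenter={("Wed", 1)})
print(common_maintenance_slots(windows_bad))  # set()
```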

Tackle the Political and Financial Ramifications of Sharing Infrastructure:
Another issue to consider is the politics of these consolidation decisions.
Application owners may have real or perceived reasons why they cannot share
infrastructure and these often translate into additional constraints and/or
what-if analysis in order to resolve the issues. In addition, most chargeback
models aren't sophisticated enough to deal with virtualized infrastructure and
will break down if resource sharing crosses certain boundaries. Using affinity
regions based on departments and application owners may be a wise decision in
cases where political or financial considerations pose a challenge.

Be Exhaustive When it Comes to Workload Patterns and Personalities:
Everyone wants to maximize savings, but there is a trade-off between risk and
return when virtualizing existing environments. What is acceptable in a lab
is usually not the same as what is required in production, and the risk of performance
degradation is often a key consideration when determining the target utilization
in a virtual environment. It's vital that organizations understand this and
properly evaluate workload patterns to determine their own comfort with savings,
stacking ratios and operational risk levels. Some of the most important aspects
of workload analysis, such as complementary pattern detection and time-shift
what-if analysis, are often overlooked when determining if workloads can be
combined. Looking at these areas in depth, across all the major CPU, I/O
and resource capacity "food groups", helps ensure that you've maximized utilization
while leaving enough headroom to cushion peak demands on the infrastructure.
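
A small sketch of complementary pattern detection and a time-shift what-if
follows. It assumes each workload is summarized as CPU demand per four-hour
block, expressed as a percentage of the target host; the profiles are
invented. The point is that the peak of the combined profile, not the sum of
the individual peaks, determines the headroom left.

```python
def stacked_peak(profiles):
    """Peak of the combined load when the profiles run on the same host."""
    return max(sum(block) for block in zip(*profiles))

def shift(profile, blocks):
    """Move a workload later in the day by `blocks` time blocks (wraps around)."""
    blocks %= len(profile)
    return profile[-blocks:] + profile[:-blocks]

# Hypothetical demand per four-hour block, as % of the target host's CPU.
day_app = [5, 10, 40, 45, 20, 5]   # busy during office hours
batch   = [40, 35, 5, 5, 10, 30]   # busy overnight: complementary
report  = [5, 10, 35, 40, 15, 5]   # peaks together with day_app

print(stacked_peak([day_app, batch]))             # 50
print(stacked_peak([day_app, report]))            # 85
print(stacked_peak([day_app, shift(report, 3)]))  # 50
```

Rescheduling the reporting job by three blocks brings the combined peak down
from 85 to 50 in this made-up case, which is the kind of option a time-shift
analysis is meant to surface.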

Understand the Overhead of Virtualization: When analyzing for VMware
consolidation in particular, you need to look at the overhead created by the
virtual machine. Unlike physical servers, VMware virtual servers create CPU
overhead when data is sent to the disk or over the network. Typically organizations
build in a fixed-percentage overhead when planning virtual environments, but
this approach can sell systems short. The best approach is to properly analyze
I/O rates and project a more accurate utilization curve that factors in application
workload as well as the true overhead introduced by virtualization.
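
To see why a fixed percentage can mislead, compare it with a projection driven
by I/O rates. The sketch below is only a toy model: the per-I/O CPU costs are
placeholder coefficients chosen for illustration, not measured or published
VMware figures, and would have to be calibrated for a real environment.

```python
def fixed_overhead(cpu_pct, overhead_pct=10.0):
    """Projected CPU using a flat percentage uplift (assumed 10%)."""
    return cpu_pct * (1 + overhead_pct / 100)

def io_based_overhead(cpu_pct, disk_iops, net_kpps,
                      cpu_per_kiops=2.0, cpu_per_kpps=0.5):
    """Projected CPU using hypothetical costs per 1,000 disk I/Os per second
    and per 1,000 network packets per second."""
    return cpu_pct + (disk_iops / 1000) * cpu_per_kiops + net_kpps * cpu_per_kpps

# Hypothetical servers: (native CPU %, disk IOPS, network kilo-packets/s)
compute_heavy = (40.0, 100, 1)
io_heavy = (20.0, 4000, 20)

for name, (cpu, iops, kpps) in (("compute_heavy", compute_heavy),
                                ("io_heavy", io_heavy)):
    print(name,
          round(fixed_overhead(cpu), 1),
          round(io_based_overhead(cpu, iops, kpps), 1))
```

Under these assumed coefficients the flat uplift overstates the compute-heavy
server slightly and understates the I/O-heavy one by a wide margin.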



Analyze Constraints Together, Not in Isolation: Don't plan virtualization
based on any one constraint viewed in isolation. It's important to consider
all the critical constraints together when choosing targets. Taking a one-dimensional
analysis of workload, for example, will not only limit your success, but can
cause critical performance, security and compliance issues. Organizations should
be taking a multi-dimensional look at the net effect of all of the key constraints
applied to the pool of potential resources in order to determine the optimal
path to a virtualized infrastructure.

Don't Go Backwards When it Comes to Security and Compliance: Ensure
that as machines are virtualized they are not breaching compliance rules. For
example, regulations regarding information sharing between divisions within
financial services or healthcare operations necessitate that certain applications
and databases be kept separate. Keeping systems apart from their disaster recovery
counterparts or cluster/replication peers is also critical. In addition, security
zones should be maintained unless the organization has a clear mandate to redefine
what can cohabitate in an environment and/or on a physical system.

Understand the New Roles Introduced by Virtualization: ESX administrators
are a new breed of IT professional, and the fact that they often have access
to the disk images of multiple virtual servers tends to give them broad visibility
into applications and their data. This sometimes creates a "super super"
user role that is unprecedented in many environments, and has the potential
to violate regulatory and internal compliance rules. Proper virtualization analysis
and planning looks for these vulnerabilities and provides a risk matrix that
helps the organization ensure continued compliance.

Don't Abuse VMotion: VMotion is an extremely powerful technology that
will undoubtedly revolutionize the way many environments are managed. That said,
it is not wise to use it as a crutch and rely on it to compensate for poor planning
or inadequate management of an environment. Purposefully creating sub-optimal
VM placements with the expectation that you can VMotion your way out of trouble
is rarely a good strategy, particularly in production environments. This creates
a 'try it and see' culture that encourages people to try out different combinations
and assume they can just reverse these out if they don't work.

Lay Explicit Ground Rules for DRS: VMware's Distributed Resource Scheduler
(DRS) automatically migrates virtual machines according to workload balancing criteria.
Because it is not inherently aware of the technical and business constraints
on an environment, it can scramble systems from both a technical and a business
perspective. To combat this effect, DRS supports affinity and anti-affinity
rules that identify which systems should be kept together and which
should be kept apart. While good in principle, these rules are difficult to define
without a proper understanding of the relevant constraints. A convenient
byproduct of the constraint-based analysis described above is a complete map
of all relevant affinities and anti-affinities in a server cluster. That map provides
rules that eliminate potential conflicts and ensure that security zones, business
constraints, compliance issues, disaster recovery and chargeback systems are
all respected and that the virtualized infrastructure remains optimized over
time.
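
One way to picture that map is as a list of pairwise rules derived from
constraint data. The sketch below does not call any VMware API; it only
produces a tool-neutral rule list that an administrator could then enter as
DRS rules. The fields (zone, owner, peers) and the policy itself are
assumptions made for illustration.

```python
from itertools import combinations

def derive_rules(vms):
    """Derive pairwise placement rules from hypothetical constraint data."""
    rules = []
    for a, b in combinations(sorted(vms), 2):
        va, vb = vms[a], vms[b]
        if va["zone"] != vb["zone"]:
            rules.append(("anti-affinity", a, b, "different security zones"))
        elif b in va["peers"] or a in vb["peers"]:
            rules.append(("anti-affinity", a, b, "cluster/DR peers"))
        elif va["owner"] == vb["owner"]:
            # One possible policy: keep a department's systems together
            # so that chargeback boundaries are not crossed.
            rules.append(("affinity", a, b, "same chargeback owner"))
    return rules

# Hypothetical virtual machines and their constraints.
vms = {
    "web1": {"zone": "dmz", "owner": "retail", "peers": {"web2"}},
    "web2": {"zone": "dmz", "owner": "retail", "peers": {"web1"}},
    "db1":  {"zone": "internal", "owner": "retail", "peers": set()},
    "rpt1": {"zone": "internal", "owner": "retail", "peers": set()},
    "hr1":  {"zone": "internal", "owner": "hr", "peers": set()},
}

rules = derive_rules(vms)
for rule in rules:
    print(rule)
```

Checking security zones first and peers second means that a stricter
constraint always wins over the softer chargeback preference.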

Model Plenty of "What If" Scenarios: Organizations should
test out scenarios leveraging analysis of business, technology and workload
constraints to better manage their pool of resources. Virtualization allows
capacity to be managed in aggregate, providing the potential to revolutionize
capacity planning. This makes it possible for businesses to explore a variety
of options for optimizing their environment. What would happen, for example,
if I virtualized multiple data centers together? Which servers are good candidates
for consolidation and will work best together? What is the difference between
putting these servers on blades versus rack-mount systems? Altering preconceived
notions of which servers should be included in an initiative or adjusting risk
levels can reveal new opportunities for savings.
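
A simple what-if on risk levels can be sketched with a first-fit-decreasing
packing heuristic, which is a stand-in chosen here for brevity rather than the
method of any particular planning tool. The demands are invented peak CPU
figures, expressed as a percentage of one host, and the only constraint
modeled is the target utilization.

```python
def hosts_needed(demands, host_capacity, target_pct):
    """Count hosts needed when each host may be filled to target_pct percent."""
    limit = host_capacity * target_pct / 100
    hosts = []  # load already placed on each host
    for demand in sorted(demands, reverse=True):
        if demand > limit:
            raise ValueError(f"demand {demand} exceeds the per-host limit {limit}")
        for i, used in enumerate(hosts):
            if used + demand <= limit:
                hosts[i] = used + demand
                break
        else:
            hosts.append(demand)  # nothing fits: open a new host
    return len(hosts)

# Hypothetical peak CPU demand of eight candidate servers.
demands = [30, 25, 20, 20, 15, 10, 10, 5]

print(hosts_needed(demands, 100, 60))  # 3
print(hosts_needed(demands, 100, 80))  # 2
```

Raising the target utilization from 60% to 80% saves one host in this example,
at the price of less headroom; the same loop can be rerun with a different
candidate list to test which servers belong in the initiative.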

Don't Get "Tunnel Vision" When it Comes to Virtualization:
Understand what alternatives exist to virtualization. Any virtualization initiative
should be part of an overall optimization program. Organizations need to recognize
that virtualization is just one of several strategies that can be put into place.
Java applications and J2EE content are already abstracted from their physical
environment, and database instances reside in a database server that isolates
them from the surrounding infrastructure. Given this, it may not be necessary
to virtualize these applications at the operating system level. Utilizing their
inherent scaling/clustering strategy may be more effective, both from a technical
and a financial perspective.

Conclusion

Virtualization planning is not just a sizing exercise. From a planning and
management perspective, virtualization is a multi-faceted challenge that can
quickly become political. A methodical and data-driven approach to assessing
and planning virtualization opportunities is the best way to drive out risk,
positively engage application owners, and ensure that success is achieved beyond
the "low-hanging fruit". To that end, leveraging multi-dimensional
analysis of all critical constraints and carefully planning for the specific
technologies and platforms in use are key to assuring the success of virtualization
initiatives.