Storage Tip: Application classification is not data classification

October 23, 2006, 11:22 AM, ITworld.com




What seems to be the problem? You want to implement information lifecycle management (ILM). ILM is the policy-driven process of managing information as its value changes over its full lifecycle, from conception to disposition. To do this, businesses must classify the data that they wish to manage in order to determine when that data has changed "value." So you need to understand the data classification process: what it is, why you should do it, and how you go about doing it. That starts with understanding what data classification is, and what it is not.



What do you need to know? Data classification is the process of sorting data into distinct pools (i.e., categories) to which different policies apply. Each category is treated differently: its own service levels apply for data protection, data security, and compliance. Different categories may -- or may not -- be on different tiers of storage. Data is migrated (i.e., moved) from one pool to another when its classification changes. If no service level difference exists between two pools of data, they can be collapsed into one pool from a service level perspective (even though the data is not physically commingled from an application perspective).
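To make the idea of collapsing pools concrete, here is a minimal Python sketch. The category names, the ServiceLevel fields, and every value in it are hypothetical and chosen only for illustration; nothing here comes from a particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceLevel:
    """Hypothetical service-level attributes attached to a data category."""
    backup_interval_hours: int
    encrypted: bool
    retention_years: int


# Hypothetical categories; the names and values are illustrative only.
POLICIES = {
    "active-orders": ServiceLevel(1, True, 7),
    "closed-orders": ServiceLevel(24, True, 7),
    "archived-orders": ServiceLevel(24, True, 7),
    "scratch": ServiceLevel(168, False, 0),
}


def collapse_pools(policies):
    """Group categories whose service levels are identical.

    Categories in the same group can be managed as one pool from a
    service level perspective, even if they stay physically separate.
    """
    pools = {}
    for category, level in policies.items():
        pools.setdefault(level, []).append(category)
    return sorted(sorted(names) for names in pools.values())


if __name__ == "__main__":
    for pool in collapse_pools(POLICIES):
        print(pool)
```

In this sketch "closed-orders" and "archived-orders" share one service level, so they end up in the same pool, while the other two categories each remain on their own.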



From a storage-centric perspective, data classification organizes data so that IT can manage it more easily. For example, data classification makes tiered storage practical, and tiering saves money because a lower tier is typically less expensive than the primary tier. Data classification may also identify data that can be destroyed. Getting rid of unnecessary data frees space on existing arrays, so those arrays can accommodate more data before new storage has to be purchased.



From a business perspective, one use of data classification is isolating compliance data so that it can be managed effectively. Having the same application manage a commingled pool of compliance and non-compliance data, while conceivably feasible, would put an added code burden on the application. The application would need the appropriate logic for handling both kinds of data, and that extra code could be burdensome to write and to maintain. Separating the data into two pools -- each controlled by a different application -- is preferable. Note that the original application may still be able to read (and hence use) the compliance data for business purposes, such as customer history analysis, but it can no longer update or delete that data.
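The separation of control described above can be sketched as a small permission table. This is an illustration under stated assumptions: the application names, pool names, and action names are all hypothetical, and a real system would enforce these rules in the storage or records management layer rather than in a dictionary.

```python
# Hypothetical permission table: which application may perform which
# actions on which pool. All names are illustrative assumptions.
PERMISSIONS = {
    ("order-app", "operational"): {"read", "update", "delete"},
    ("order-app", "compliance"): {"read"},
    ("records-app", "compliance"): {"read", "dispose"},
}


def is_allowed(application, pool, action):
    """Return True if the application may perform the action on the pool."""
    return action in PERMISSIONS.get((application, pool), set())


if __name__ == "__main__":
    # The original application can still read compliance data...
    print(is_allowed("order-app", "compliance", "read"))
    # ...but it can no longer update or delete it.
    print(is_allowed("order-app", "compliance", "delete"))
```

Keeping the rules in one table, outside either application, means neither application has to carry logic for both compliance and non-compliance data.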



What are your choices? One way to classify is by application: for example, as mission-critical, business-critical, task-critical, and so on. Service Level Objectives (SLOs) are then set for each application. Typically, all the data that the application controls has the same recovery point objective (RPO) for operational recovery, i.e., the acceptable data loss for the application in the event of a data protection problem. This is a simple approach, and the business can probably benefit from application classification, for example by not spending scarce budget dollars on a higher level of availability than an application really needs. However, application classification is not data classification.
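As a rough sketch of application classification, the Python below assigns an RPO to each application class and lets every dataset inherit the RPO of the application that controls it. The class names follow the ones mentioned above; the application names and RPO values are hypothetical.

```python
from datetime import timedelta

# Hypothetical RPO targets per application class; values are illustrative.
RPO_BY_APP_CLASS = {
    "mission-critical": timedelta(minutes=5),
    "business-critical": timedelta(hours=1),
    "task-critical": timedelta(hours=24),
}

# Hypothetical assignment of applications to classes.
APP_CLASS = {
    "order-entry": "mission-critical",
    "email": "business-critical",
}


def rpo_for_dataset(controlling_application):
    """Every dataset inherits the RPO of the application that controls it."""
    return RPO_BY_APP_CLASS[APP_CLASS[controlling_application]]


if __name__ == "__main__":
    print(rpo_for_dataset("order-entry"))
    print(rpo_for_dataset("email"))
```

Notice that the lookup takes only the application as input. Nothing about the data itself (its age, its compliance status, its current purpose) influences the result, which is exactly why this approach falls short of data classification.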



The reason is that data does not belong to just one application over its lifecycle. Classifying data is necessary because the classification determines which application should control the data at each stage of its lifecycle. For example, e-mails that must be retained for compliance reasons should not be under the control of an application that can delete them. Declining to classify data implies that the value of the data never changes throughout its lifecycle.



Value, in this sense, might be considered the utility of a piece of data, which changes when the purpose for which the data is intended changes. Consider, for example, a commercial business transaction. An open customer order goes through an order fulfillment and billing process. When the customer has the product, and payment has been received, the sale becomes a closed transaction. The information in a closed transaction may still serve useful business purposes -- for service and support, for analyzing customer buying behavior, and for financial reconciliation, among others. But the data no longer serves its original purpose of revenue-generating order fulfillment.
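The order example can be sketched in a few lines of Python. This is a simplified illustration: the Order fields, the class labels, and the application names in CONTROLLER are assumptions made for the example, and a real classification policy would weigh more than two flags.

```python
from dataclasses import dataclass


@dataclass
class Order:
    """Hypothetical order record with just enough state for the example."""
    order_id: str
    delivered: bool = False
    paid: bool = False


def classify(order):
    """Return the (hypothetical) data class for an order record."""
    if order.delivered and order.paid:
        return "closed-transaction"
    return "open-order"


# Hypothetical mapping from data class to the controlling application.
CONTROLLER = {
    "open-order": "order-fulfillment-app",
    "closed-transaction": "records-archive-app",
}

if __name__ == "__main__":
    order = Order("A-100")
    print(classify(order), CONTROLLER[classify(order)])
    order.delivered = True
    order.paid = True
    print(classify(order), CONTROLLER[classify(order)])
```

The record is the same before and after; only its classification changes, and with it the application that should be in control of the data.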



So you can classify by application if you wish, but do not think that you have done data classification as well. You must separate data into different classes in order to obtain the benefits of ILM, such as tiering storage. You may not get all the benefits right away, but you have to get started somewhere.

 
