With the noise around cloud computing rising by the day, there are basic operational issues one should not forget, cloud or no cloud. One such issue is the discipline of ILM (Information Lifecycle Management). How do you manage data over a lifetime of many years or decades? Do you keep all of it in the live systems, where it drastically degrades the performance of the applications that use it? As everyone knows, the appetite for data is growing by leaps and bounds. Before long, a “personal petabyte” will be quite viable, given the need to store audio and video. A petabyte is 1,000 terabytes, a terabyte is 1,000 gigabytes, and a gigabyte is 1,000 megabytes. Now do the math: a petabyte is ten to the power of 15 bytes, and 1,000 petabytes make one “exabyte”. Back in 2002, one petabyte would have cost $2M, whereas in 2012, ten years later, it will cost $2K. This is a real Moore’s law for disk storage!
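To actually do the math, here is a small Python sketch that uses only the figures quoted above: decimal units, $2M per petabyte in 2002 and $2K in 2012. The variable names are mine, and the prices are the round numbers from this post, not market data.

```python
import math

# Decimal (SI) storage units, as used in the text.
MB = 10**6
GB = 1000 * MB
TB = 1000 * GB
PB = 1000 * TB
EB = 1000 * PB

# Figures quoted in the post (round numbers, not market data).
cost_2002 = 2_000_000  # dollars per petabyte
cost_2012 = 2_000      # dollars per petabyte
years = 2012 - 2002

drop_factor = cost_2002 / cost_2012    # how many times cheaper
halvings = math.log2(drop_factor)      # number of price halvings
years_per_halving = years / halvings   # average time per halving

print(f"1 PB = {PB:.0e} bytes, 1 EB = {EB:.0e} bytes")
print(f"Price fell {drop_factor:.0f}x in {years} years")
print(f"That is about one halving every {years_per_halving:.2f} years")
```

Taken at face value, the quoted prices imply the cost per petabyte halving roughly once a year.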
Most enterprise business data resides as structured data managed by a DBMS (e.g. Oracle or DB2). There are production databases of 100-plus terabytes, mostly in places such as Walmart’s data warehouse for retail transactions. Telcos also have huge databases of call records. As size grows, performance degradation is normal. Hence enterprises must create a multi-tiered archiving policy. For example, current data can stay in active databases for 2-3 years, then spend 2-4 years as inactive data, followed by several years as historical data. The older the data gets, the more it becomes a candidate for cloud storage. But access is paramount, whichever tier the data sits in. For compliance and legal reasons, historical data should be easily accessible at high speed with smart search.
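The tiering idea can be sketched in a few lines of Python. This is only an illustration: the function name `archive_tier` and the cut-offs (3 years active, then 4 years inactive) are my own assumptions, taken from the upper ends of the ranges above. A real policy would also weigh access patterns, legal holds, and application rules, not just age.

```python
from datetime import date

# Hypothetical tier boundaries, in years of age. The text gives ranges
# (2-3 years active, then 2-4 years inactive); these pick the upper ends.
ACTIVE_YEARS = 3
INACTIVE_YEARS = 4


def archive_tier(record_date: date, today: date) -> str:
    """Return the tier a record belongs to, based only on its age."""
    age_years = (today - record_date).days / 365.25
    if age_years < ACTIVE_YEARS:
        return "active"
    if age_years < ACTIVE_YEARS + INACTIVE_YEARS:
        return "inactive"
    return "historical"


today = date(2012, 1, 1)
for d in (date(2011, 6, 1), date(2007, 6, 1), date(2001, 6, 1)):
    print(d, "->", archive_tier(d, today))
```

Each tier can then be mapped to its own storage, with the historical tier the natural candidate for cheaper or cloud storage, provided it stays searchable.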
Another aspect of ILM is the management of copies of data. Some companies may need 8-20 copies of active data for test, development, disaster recovery, quality control, etc. A 200 GB database ends up as 1,200 GB of data with just six copies. Such numbers are normally not reflected in planning, and IT shops get shocked when they see them and the associated costs. Another area at many enterprises is the “application retirement” issue. This comes up with M&A or as a precursor to a move into the cloud. It is usually addressed in a very ad hoc way, resulting in unforeseen delays and cost. Any automation here should be highly welcome.
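Returning to the copy arithmetic for a moment, a rough sketch shows how quickly the footprint grows. It assumes every copy is a full-size clone with no compression, deduplication, or subsetting, and the function name `total_footprint_gb` is just for illustration.

```python
def total_footprint_gb(db_size_gb: float, copies: int) -> float:
    """Total storage used when a database exists as `copies` full-size copies."""
    return db_size_gb * copies


# Six copies is the example in the text; 8 and 20 are the range quoted above.
for copies in (6, 8, 20):
    total = total_footprint_gb(200, copies)
    print(f"{copies:>2} copies of 200 GB -> {total:,.0f} GB")
```

At the 8-20 copies some companies keep, the same 200 GB database occupies between 1,600 GB and 4,000 GB.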
Gartner Group said this last year, “The return on the investment for implementing a structured data archiving solution is exceptionally high, especially for application retirement or when deployed for a packaged application for which vendor-supplied templates are available to ease implementation and maintenance.”
One company leading in this space (I am an adviser) is Solix, which provides tools for all of the areas mentioned above. Their Enterprise Data Management System (EDMS) platform provides a comprehensive set of ILM tools for enterprises. Solix even introduced an appliance to ease the cost and administrative burdens for clients. The rapid adoption of Solix products is testimony to the growing importance of data archiving, application retirement, data masking, and test data management.
ILM should be a well-thought-out discipline at every IT organization.