Starting Nearline 2.0: The Quick Check Approach
In previous posts, I introduced the concept of “Best Practices” for Nearline 2.0. Today, I will get down to the details of how and where to start with a Nearline 2.0 solution, beginning with a Best Practices approach designed to quickly identify the benefits of such an implementation in a given environment. At SAND Technology, we offer this “Nearline 2.0 Quick Check” as part of our professional services portfolio.
The Nearline 2.0 Quick Check normally takes about a week, and is designed to identify the main sources of data growth in the enterprise, the overall rate of growth, and the impact of this growth on existing Service Level Agreements (SLAs). It also involves describing data update processes in terms of frequency and dependencies. Based on this information, the concrete benefits of Nearline 2.0 for the organization can be clearly defined, along with the potential effects on TCO of not adopting the solution or of delaying its implementation. The key by-product of this exercise is a business case report that can be presented to corporate management.
The first step in the Nearline 2.0 Quick Check involves installing a set of measurement tools to identify the overall size of each data warehouse and data mart used by the enterprise, along with the annual growth rate of each (note that it is not unusual for organizations to have more than one data warehouse in production at a given time, for a variety of reasons). A first set of metrics, generally extracted by the operational staff, provides a general analysis of storage usage over a specific period. The second set (pertaining to database objects) can be provided by the DBA team or collected using the Quick Check measurement tools. After discussions with key personnel, the collected metrics are used as the basis for estimating the TCO savings that would result from implementing various Nearline 2.0 scenarios.
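To make the growth-rate metric concrete, here is a minimal sketch of how an annual growth rate could be derived from two size measurements of the same warehouse. It is not the Quick Check tooling itself: the function name, the gigabyte units and the compound-growth assumption are all mine, chosen only for illustration.

```python
def annual_growth_rate(size_start_gb, size_end_gb, days_between):
    """Compound annual growth rate implied by two size measurements.

    Hypothetical helper: assumes growth compounds smoothly between
    the two measurement dates.
    """
    if size_start_gb <= 0 or size_end_gb <= 0:
        raise ValueError("sizes must be positive")
    if days_between <= 0:
        raise ValueError("days_between must be positive")
    years = days_between / 365.0
    return (size_end_gb / size_start_gb) ** (1.0 / years) - 1.0


# Illustrative figures only: a warehouse that grew from 1,000 GB
# to 1,500 GB over one year has grown at a rate of 50% per year.
rate = annual_growth_rate(1000.0, 1500.0, 365)
print(f"Annual growth rate: {rate:.1%}")
```

Running the same calculation for each data warehouse and data mart gives a comparable growth figure per system, even when the measurement periods differ.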
The final set of metrics collected during the Nearline 2.0 Quick Check relates to organizational SLAs and TCO. Batch windows are analyzed to identify the major points of contention, as well as the processes that require long execution times and place a heavy load on processors, the network and the I/O subsystem. Frequently, the operational team will already know which processes are causing them nightmares, and can say whether they occur on a daily, weekly, monthly, quarterly or annual basis. TCO can be more difficult to evaluate, and for this reason needs to be examined with the assistance of organizational management.
The Nearline 2.0 Advantage
Of course, at the heart of this approach is the recently developed Nearline 2.0 concept, with all the new data management scenarios it has enabled. As I mentioned in a previous post, older Nearline 1.0 or archiving solutions presented major drawbacks, in that data removed from the online environment became very difficult and costly, if not impossible, to access. Because of this, with those older solutions, determining the precise timing of data migration to a Nearline solution or Archive is a critical decision that can have a major impact on the organization.
A Nearline 2.0 solution, because of its performance characteristics and unique internal architecture, enables implementation of Nearline scenarios designed to deliver the highest return on investment and reduction of TCO, without giving up high-performance, flexible access to data for analytic purposes. Elaborating such a scenario doesn’t require any in-depth analysis of data access patterns. Rather, the data can be “nearlined” as soon as it becomes “static”, meaning that no more updates are planned for it. Analysis of update requirements will show that some data records are never updated after their creation, and can therefore be considered for nearlining right away. This is frequently the case with CDR, RFI or transaction log data. Data of this sort offers some very interesting data modeling options in a Nearline 2.0 solution, a topic I will be covering in my next post.
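The “static data” rule described above can be sketched in a few lines. This is only an illustration of the selection logic, not part of any SAND product: the TableProfile structure, its field names and the sample figures are hypothetical, standing in for whatever the update-process analysis actually produces.

```python
from dataclasses import dataclass


@dataclass
class TableProfile:
    """Hypothetical summary of one table from the update analysis."""
    name: str
    size_gb: float
    updates_planned: bool  # True if any update process still touches the data


def nearline_candidates(profiles):
    """Return the static tables (no updates planned), largest first."""
    static = [p for p in profiles if not p.updates_planned]
    return sorted(static, key=lambda p: p.size_gb, reverse=True)


# Illustrative inventory only; names and sizes are made up.
profiles = [
    TableProfile("call_detail_records", 4200.0, False),
    TableProfile("customer_accounts", 150.0, True),
    TableProfile("transaction_log", 900.0, False),
]

for table in nearline_candidates(profiles):
    print(f"{table.name}: {table.size_gb:.0f} GB can be nearlined now")
```

Note that the selection uses only the update status, not access patterns, which is the point of the approach: data qualifies as soon as it stops changing.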
Quick Check Results
Once the Nearline 2.0 Quick Check is complete, a report can be prepared describing the current situation and where the enterprise will be in 6 months, in 1 year and so on. What is the expected cost to the enterprise of supporting the current rate of growth? What would be the effect of implementing various Nearline 2.0 scenarios? Armed with this information, the management team will have clear facts on which to base a decision to implement Nearline 2.0. And, based on my experience over the last few years, I’m pretty sure that they will decide to proceed earlier rather than later.
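As a rough illustration of the “where will we be in 6 months, in 1 year” question, the sketch below projects a warehouse size forward from a measured annual growth rate. The function name, the compound-growth model and the sample numbers are my own assumptions for the example; a real report would use the metrics collected during the Quick Check.

```python
def project_size(current_gb, annual_growth_rate, months_ahead):
    """Projected size after months_ahead months of compound growth.

    Hypothetical helper: assumes the measured annual growth rate
    stays constant over the projection period.
    """
    if current_gb < 0 or months_ahead < 0:
        raise ValueError("size and horizon must be non-negative")
    if annual_growth_rate <= -1.0:
        raise ValueError("annual_growth_rate must be greater than -100%")
    return current_gb * (1.0 + annual_growth_rate) ** (months_ahead / 12.0)


# Illustrative figures only: a 1,000 GB warehouse growing 44% per year.
current_gb = 1000.0
growth = 0.44
for months in (6, 12, 24):
    projected = project_size(current_gb, growth, months)
    print(f"In {months:2d} months: about {projected:,.0f} GB")
```

The same projection can be repeated with the static data removed from the online total, to compare the current trajectory against a given Nearline 2.0 scenario.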