
WinterCorp


VLDB Vision
June 1998

Assessing the Feasibility of a
VLDB Solution
Is your project possible?

Richard Winter

I just finished reading Into Thin Air, Jon Krakauer's compelling account of the disastrous Mt. Everest expeditions in 1996 during which 12 people died in approximately 24 hours. It is an excellent book, and one that is especially significant for those involved in VLDB implementation; as I read it, I could not escape the parallels.
      One of the remarkable aspects of the story is that virtually everyone involved (guides, paying clients, and professional climbers alike) seemed to underestimate the danger continually. Yes, there have been many successful climbs of Mt. Everest; by now hundreds of people have made it all the way up and returned safely. But roughly one in four who reach the summit never return.
      Once you get to about 25,000 feet, there is no safety. No one can live for long in the conditions on Mt. Everest above that altitude, even with supplemental oxygen. People suddenly and unpredictably develop altitude sickness and die within hours; people often can't sleep or eat, and they become exhausted quickly; sudden storms are lethal; and no one can think very clearly. Not much has to go wrong to cause a fatality. This region is called "the death zone."
      It seems obvious that, if you are going on an expedition in the Himalayas, you should know about the death zone. You have to know if you are going to enter it, and you have to prepare your expedition differently if you are going to be in the death zone. You need to work out contingency plans well in advance. Even then, you had better know that the risks are much higher above 25,000 feet, and you need to be sure it's worth it to you and the sponsor of your expedition to go there.
      There is also a VLDB "death zone." Beyond this frontier, the risks are dramatically higher, and projects often fail. When you are building a database that goes significantly beyond prior experience in some important dimension (such as database size, query complexity, transaction volume, or table size) or combination of dimensions (such as size and update or volume and availability), you may well be in the VLDB death zone, with your career and project every bit as much at risk as your life would be above 25,000 feet on Mt. Everest.
      Of course, your risk would be much higher on Mt. Everest if you didn't even know about the death zone, or if you knew of it but didn't realize you had lost your way and wandered into it. You would be unprepared: no oxygen bottles, no emergency supplies for altitude sickness, perhaps no radios.
      Having worked by now with well over a hundred VLDB implementation teams, I can tell you that it is not at all uncommon for them to wander out beyond the frontier without even realizing that they have done so. And that is a dangerous situation for everyone involved.
      The crux of the danger is this: If you don't know you are beyond the frontier, then you assume your project is about the following question: What is the best way for me to implement this database in order to meet the technical requirements?
      But if you are beyond the VLDB frontier, then your project may really be about this question: Is it feasible to implement this database so as to meet the technical requirements?
      The second question is the central, early question in VLDB projects that go beyond the frontier, because if feasibility is in doubt, you need to be very careful about managing the implementation process. It often makes sense to modify the business process so the technical requirements can be relaxed (or at least staged over time) to control the business risk. Other times, it's good business to hang tough on the technical requirements but invest in additional steps (pilots, benchmarks, or other tests) to reduce the technical risk.

How to Tell If You Are Beyond the Frontier
What is the frontier? It is an irregular boundary between the known and the unknown in VLDB implementation. When a given class of VLDB problems has been solved several times in practice (in terms of engineering requirements, complexity, platform architecture, and application domain), then that class of problems is "known." If you tap into existing experience, proceed professionally, and don't run into Murphy's Law too often, you should succeed. In this type of situation, you need to apply extra care if you think you may be near the frontier, but what you are doing is simply dealing with challenging VLDB problems. You aren't pioneering in the death zone.
      And what do I mean by "several times"? About 10. Let's say you are thinking of building a direct marketing system on Unix, running Oracle on a Sun SMP platform and involving about 100GB of data. Let's also suppose you know what your principal business processes, application objectives, technical requirements, and so on are. You know roughly how this system is going to look and what it is going to have to do.
      Can you find about 10 users who have done something that sounds about the same in terms of all the critical factors: scale, complexity, platform choices, and application? Are they actually in production? Are they meeting requirements that sound similar to yours? Have you actually spoken with at least half of them? Are they credible? Have they actually attained, or surpassed, the objectives you have set for the next year or two? Are they handling a workload similar to yours, or one that at least covers the tougher elements of your workload?
      If you can find 10, do a quick validation (look at the key issues) of most of the implementations and complete an in-depth validation on about half of them; if you come out of that feeling as if the other users have already ventured, in database terms, where you need to go, then you are in the realm of mainstream VLDB practice. You are probably not facing extraordinary technical risks. You're not near the death zone. You may face a challenging VLDB project, but you shouldn't face a feasibility question; so you can concentrate on my first question.
      Roughly speaking, if you can find six to 10 implementations similar to yours, you are near the frontier. If you can find three to five, you are on the frontier. And if you can't even find three, you are beyond the frontier.
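
      The rule of thumb above can be written down as a small lookup. What follows is only a sketch in Python: the function name and labels are made up for illustration, and it assumes that a count of exactly 10 validated sites is treated as mainstream practice (the ranges above overlap at 10, so that is a judgment call). It also assumes the count includes only sites you have actually validated, not names on a vendor's list.

```python
def frontier_position(validated_sites: int) -> str:
    """Map a count of comparable, validated production sites to a position.

    Thresholds follow the column's rule of thumb; treating exactly 10
    as "mainstream" is an assumption, since the stated ranges overlap.
    """
    if validated_sites < 0:
        raise ValueError("site count cannot be negative")
    if validated_sites >= 10:
        return "mainstream"
    if validated_sites >= 6:
        return "near the frontier"
    if validated_sites >= 3:
        return "on the frontier"
    return "beyond the frontier"


if __name__ == "__main__":
    # Hypothetical counts, one per band.
    for count in (12, 7, 4, 1):
        print(count, "->", frontier_position(count))
```

      The value of writing it this way is mostly that it forces you to state the count: if you cannot say how many comparable sites you have validated, you do not yet know where you stand.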

If You Are Near, On, or Beyond the Frontier
If you can't find 10 implementations that sound similar to what you are trying to do, I recommend that you investigate the 10 closest you can find and see how close they actually are. This means asking: Where is the frontier, and how far beyond it are we trying to go?
      If you are anywhere near the frontier, I recommend involving people with "frontier" and "beyond the frontier" experience. This is uncharted territory, where even the vendors don't know exactly what their products can do. Professionals with extensive experience with "normal" sized databases may still be unprepared to deal with the issues they will confront, just as experience in the Rockies does not, by itself, prepare you for the higher reaches of the Himalayas.
      To paraphrase Yogi Berra, "In VLDB, it ain't been done 'til it's been done." If you are looking to implement a two-terabyte database marketing system and the platform you are considering has never been used on anything larger than one terabyte, you are beyond the frontier. Do not assume linear scaling. Instead, recognize that you face a significant, unknown engineering factor and that you may have a feasibility question. Focus on the question of feasibility.

Establishing Your Position in Relation to the Frontier
It is essential that you establish your position relative to the frontier with verifiable facts. You must dig a bit below the surface to establish such matters as the size of the database. As I discussed in an earlier column ("How to Avoid the Heartbreak of Teraflation," April 1998), being loose about the meaning of database size can throw you off by a factor of 10 or more, which can put you in the death zone even though you think you are below the tree line.
      Vendors can usually provide you with a list of sites doing something similar to what you plan to do, with contact information. If they can't come up with several that look a lot like yours, that's a tip-off that you are close to the frontier.
      Note that vendors will vary significantly in their positions with respect to any particular area of the frontier. No single vendor will always be in the lead. Thus, vendor A may have much more experience with terabyte data warehouses in health care than vendor B. Conversely, vendor B may be far ahead in large-scale database marketing systems.
      If you are anywhere near the frontier, you must check these sites out independently of the vendor. This is not a matter of trust; it is simply important to your company that the facts be verified directly with other users or independent parties.
      It is also a good idea to seek some information that is independent of vendor sources, such as the Winter VLDB Survey ("Giants Walk the Earth," September 1997). This is updated annually, which is important because the frontier moves rapidly.

Attacking the Feasibility Question

If there is a significant feasibility question in regard to your VLDB requirements, you must make it visible in the implementation process with defined decision points to deal with it. Usually, you want to see the question surfaced and resolved in or before the platform selection process.
      The most common error I see is this one: Even though feasibility is uncertain, users proceed through selection as if their only concern is to find out which of the products under consideration will "score" the best. This can be a fatal error.
      For example, suppose you needed to acquire an airplane to fly across the Pacific Ocean. Even with refueling along the way, such a plane needs to be able to fly thousands of miles over water without stops. But suppose your organization customarily does business only with companies that supply planes aimed at more routine business uses. It does no good to evaluate these planes and select from them the one that has the longest range if that range is only 900 miles. But this is how VLDB evaluations are done all the time. It happens because business requirements aren't always clear; projects fall behind schedule; someone rushes into the selection process; then by the time feasibility questions surface, it seems best just to plow ahead and hope.
      What you need to do instead is determine, before laying out the selection process, whether you are beyond the frontier. If you are, you are likely to face a feasibility question, and you should plan your acquisition process to deal with it.
      To deal with the feasibility question, you must:

  • Provide for the possibility that none of the platforms under consideration will be able to meet the requirements
  • Insist that those proposing to supply a platform demonstrate that they can meet the requirements (beyond the frontier, this will involve a benchmark or pilot project; you must measure)
  • Quantify any requirements gap (that is, the difference between your requirements and what the bidders can provide)
  • Provide (in the process) for a pause after the requirements gap has been quantified so you can determine whether to modify business requirements, technical requirements, schedules, and so on
  • If the requirements gap is to be closed by vendor actions, build in contractual commitments, appropriately staged tests, definite schedules, and fallback plans.
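
      Quantifying the requirements gap can be as simple as comparing each requirement with what a bidder has actually demonstrated in a benchmark or pilot. The Python sketch below is one way to do it, not a prescribed method: the class, the function, and all of the figures are hypothetical, and it assumes each requirement can be reduced to a single positive number where either higher or lower is better.

```python
from dataclasses import dataclass


@dataclass
class Requirement:
    name: str
    required: float       # what the business needs
    demonstrated: float   # what the bidder has measured in a test
    higher_is_better: bool = True

    def gap_factor(self) -> float:
        """Factor by which the demonstrated result falls short.

        A value of 1.0 or less means the requirement has been met.
        """
        if self.required <= 0 or self.demonstrated <= 0:
            raise ValueError("values must be positive")
        if self.higher_is_better:
            return self.required / self.demonstrated
        return self.demonstrated / self.required


def requirements_gap(requirements):
    """Return only the unmet requirements, keyed by name."""
    return {r.name: r.gap_factor() for r in requirements if r.gap_factor() > 1.0}


# Hypothetical figures for illustration only.
bid = [
    Requirement("database size (GB)", required=2000, demonstrated=1000),
    Requirement("query response (s)", required=30, demonstrated=45,
                higher_is_better=False),
    Requirement("concurrent users", required=200, demonstrated=250),
]

if __name__ == "__main__":
    for name, factor in requirements_gap(bid).items():
        print(f"{name}: short by a factor of {factor:.1f}")
```

      A list like this gives the stakeholders something concrete to review during the pause: each unmet item is either relaxed, staged over time, or covered by a vendor commitment with a test attached.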


      This entire process usually works best if all the stakeholders are made aware in advance of the nature of the questions being confronted. Some IT organizations and integrators operate by shielding their clients from the feasibility questions. It is far more effective, and more professional, to reveal the questions and the process that has been developed for resolving them. This makes the client part of the team that will weigh the risks and resolve the trade-offs between the various approaches to closing the requirements gap.