WinterCorp
VLDB Vision
June 1998
Assessing the Feasibility of a VLDB Solution
Is your project possible?
Richard Winter
I just finished reading Into Thin Air, Jon Krakauer's
compelling account of the disastrous Mt. Everest expeditions
in 1996 during which 12 people died in approximately 24 hours.
It was an excellent book that is especially significant for those
involved in VLDB implementation; as I read it, I could not escape
the parallels.
One of the remarkable aspects
of the story is that virtually everyone involved (guides,
paying clients, and professional climbers alike) seemed to
underestimate the danger continually. Yes, there have been many
successful climbs of Mt. Everest; by now hundreds of people have
made it all the way up and returned safely. But roughly one in
four who reach the summit never return.
Once you get to about 25,000
feet, there is no safety. No one can live for long in the conditions
on Mt. Everest above that altitude, even with supplemental oxygen.
People suddenly and unpredictably develop altitude sickness and
die within hours; people often can't sleep or eat, and they become
exhausted quickly; sudden storms are lethal; and no one can think
very clearly. Not much has to go wrong to cause a fatality. This
region is called "the death zone."
It seems obvious that, if
you are going on an expedition in the Himalayas, you should know
about the death zone. You have to know if you are going to enter
it, and you have to prepare your expedition differently if you
are going to be in the death zone. You need to work out contingency
plans well in advance. Even then, you had better know that the
risks are much higher above 25,000 feet and you need to be sure
it's worth it to you and the sponsor of your expedition to go
there.
There is also a VLDB "death
zone." Beyond this frontier, the risks are dramatically
higher, and projects often fail. When you are building a database
that goes significantly beyond prior experience in some important
dimension (such as database size, query complexity, transaction
volume, or table size) or combination of dimensions (such as
size and update or volume and availability), you may well be
in the VLDB death zone, with your career and project every
bit as much at risk as your life would be above 25,000 feet on
Mt. Everest.
Of course, your risk would
be much higher on Mt. Everest if you didn't even know about the
death zone or you knew of it but didn't realize you had just
lost your way and wandered into it. You would be unprepared:
no oxygen bottles, no emergency supplies for altitude sickness,
perhaps no radios.
Having worked by now with
well over a hundred VLDB implementation teams, I can tell you
that it is not at all uncommon for them to wander out beyond
the frontier without even realizing that they have done so. And
that is a dangerous situation for everyone involved.
The crux of the danger is
this: If you don't know you are beyond the frontier, then you
assume your project is about the following question: What is
the best way for me to implement this database in order to meet
the technical requirements?
But if you are beyond the
VLDB frontier, then your project may really be about this question:
Is it feasible to implement this database so as to meet the technical
requirements?
The second question is the
central, early question in VLDB projects that go beyond the frontier,
because if feasibility is in doubt, you need to be very careful
about managing the implementation process. It often makes sense
to modify the business process so the technical requirements
can be relaxed, or at least staged over time, to control
the business risk. Other times, it's good business to hang tough
on the technical requirements but invest in additional steps (pilots,
benchmarks, or other tests) to reduce the technical risk.
How to Tell If You Are Beyond the Frontier
What is the frontier? It is an irregular boundary between the
known and the unknown in VLDB implementation. When a given class
of VLDB problems has been solved several times in practice (in
terms of engineering requirements, complexity, platform architecture,
and application domain), then that class of problems is "known."
If you tap into existing experience, proceed professionally,
and don't run into Murphy's Law too often, you should succeed.
In this type of situation, you need to apply extra care if you
think you may be near the frontier, but what you are doing is
simply dealing with challenging VLDB problems. You aren't pioneering
in the death zone.
And what do I mean by "several
times"? About 10. Let's say you are thinking of building
a direct marketing system on Unix, using Oracle on a Sun SMP platform
involving about 100GB of data. Let's also suppose you know what
your principal business processes, application objectives, technical
requirements, and so on are. You know about how this system is
going to look and what it is going to have to do.
Can you find about 10 users
who have done something that sounds about the same in terms of
all the critical factors: scale, complexity, platform choices,
and application? Are they actually in production? Are they meeting
requirements that sound similar to yours? Have you actually spoken
with at least half of them? Are they credible? Have they actually
attained, or surpassed, the objectives you have set for the next
year or two? Are they handling a workload similar to yours, or
one that at least covers the tougher elements of your workload?
If you can find 10, do a
quick validation (look at the key issues) of most of the implementations
and complete an in-depth validation on about half of them; if
you come out of that feeling as if the other users have already
ventured, in database terms, where you need to go, then
you are in the realm of mainstream VLDB practice. You are probably
not facing extraordinary technical risks. You're not near the
death zone. You may face a challenging VLDB project, but you
shouldn't face a feasibility question; so you can concentrate
on my first question.
Roughly speaking, if you
can find six to 10 implementations similar to yours, you are
near the frontier. If you can find three to five, you are on
the frontier. And if you can't even find three, you are beyond
the frontier.
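The rule of thumb above can be written as a small lookup. This is only a sketch: the function name and the category labels are illustrative. It also rests on one assumption the text leaves open, namely that exactly 10 comparable sites counts as mainstream practice (per the preceding paragraph) rather than as "near the frontier."

```python
def frontier_position(similar_sites: int) -> str:
    """Classify a project by the number of comparable, validated
    production implementations found.

    Thresholds follow the rule of thumb in the text; treating exactly
    10 as "mainstream" is an assumption made for this sketch.
    """
    if similar_sites < 0:
        raise ValueError("similar_sites must be non-negative")
    if similar_sites >= 10:
        return "mainstream"
    if similar_sites >= 6:
        return "near the frontier"
    if similar_sites >= 3:
        return "on the frontier"
    return "beyond the frontier"


if __name__ == "__main__":
    for count in (12, 7, 4, 1):
        print(count, "->", frontier_position(count))
```

The count passed in should include only sites that survived validation (in production, credible, similar workload), not every name on a vendor's reference list.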
If You Are Near, On, or Beyond the Frontier
If you can't find 10 implementations that sound similar to what
you are trying to do, I recommend that you investigate the 10
closest you can find and see how close they actually are. This
means asking: Where is the frontier, and how far beyond it are
we trying to go?
If you are anywhere near
the frontier, I recommend involving people with "frontier"
and "beyond the frontier" experience. This is uncharted
territory, where even the vendors don't know exactly what their
products can do. Professionals with extensive experience with
"normal" sized databases may still be unprepared to
deal with the issues they will confront, just as experience
in the Rockies does not, by itself, prepare you for the higher
reaches of the Himalayas.
To paraphrase Yogi Berra,
"In VLDB, it ain't been done 'til it's been done." If
you are looking to implement a two-terabyte database marketing
system and the platform you are considering has never been used
on anything larger than one terabyte, you are beyond the frontier.
Do not assume linear scaling. Instead, recognize that you face
a significant, unknown engineering factor and that you may have
a feasibility question. Focus on the question of feasibility.
Establishing Your Position in Relation to the Frontier
It is essential that you establish your position relative to
the frontier with verifiable facts. You must dig a bit below
the surface to establish such matters as the size of the database.
As I discussed in an earlier column ("How
to Avoid the Heartbreak of Teraflation," April 1998),
being loose about the meaning of database size can throw you
off by a factor of 10 or more, which puts you in the death zone
even though you think you are below the tree line.
Vendors can usually provide
you with a list of sites doing something similar to what you
plan to do, with contact information. If they can't come up with
several that look a lot like yours, that's a tip-off that you
are close to the frontier.
Note that vendors will vary
significantly in their positions with respect to any particular
area of the frontier. No single vendor will always be in the
lead. Thus, vendor A may have much more experience with terabyte
data warehouses in health care than vendor B. Conversely, vendor
B may be far ahead in large-scale database marketing systems.
If you are anywhere near
the frontier, you must check these sites out independently of
the vendor. This is not a matter of trust; it is simply important
to your company that the facts be verified directly with other
users or independent parties.
It is also a good idea to
seek some information that is independent of vendor sources,
such as the Winter VLDB Survey ("Giants Walk the Earth,"
September 1997). This is updated annually, which is important
because the frontier moves rapidly.
Attacking the Feasibility Question
If there is a significant feasibility question in regard to
your VLDB requirements, you must make it visible in the implementation
process with defined decision points to deal with it. Usually,
you want to see the question surfaced and resolved in or before
the platform selection process.
The most common error I see
is this one: Even though feasibility is uncertain, users proceed
through selection as if their only concern is to find out which
of the products under consideration will "score" the
best. This can be a fatal error.
For example, suppose you
needed to acquire an airplane to fly across the Pacific Ocean.
Even with refueling along the way, such a plane needs to be able
to fly thousands of miles over water without stops. But suppose
your organization customarily does business only with companies
that supply planes aimed at more routine business uses. It does
no good to evaluate these planes and select from them the one
that has the longest range if that range is only 900 miles. But
this is how VLDB evaluations are done all the time. It happens
because business requirements aren't always clear; projects fall
behind schedule; someone rushes into the selection process; then
by the time feasibility questions surface, it seems best just
to plow ahead and hope.
What you need to do instead
is, before laying out the selection process, determine whether
you are beyond the frontier; if so, you are likely to face
a feasibility question. If you do face a feasibility question,
plan your acquisition process to deal with it.
To deal with the feasibility
question, you must:
- Provide for the possibility that none of the platforms under
consideration will be able to meet the requirements
- Insist that those proposing to supply a platform demonstrate
that they can meet the requirements (beyond the frontier, this
will involve a benchmark or pilot project; you must measure)
- Quantify any requirements gap (that is, the difference between
your requirements and what the bidders can provide)
- Provide (in the process) for a pause after the requirements
gap has been quantified so you can determine whether to modify
business requirements, technical requirements, schedules, and
so on
- If the requirements gap is to be closed by vendor actions,
build in contractual commitments, appropriately staged tests,
definite schedules, and fallback plans.
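One way to quantify a requirements gap, as the third step above calls for, is to express each shortfall as a fraction of the requirement. The sketch below is illustrative only: the `Requirement` class, its field names, and the sample figures are hypothetical, and a real evaluation would use numbers measured in a benchmark or pilot.

```python
from dataclasses import dataclass


@dataclass
class Requirement:
    name: str
    required: float                # value the business needs
    measured: float                # value demonstrated in a benchmark or pilot
    higher_is_better: bool = True  # False for response times and similar


def requirement_gap(req: Requirement) -> float:
    """Return the shortfall as a fraction of the requirement.

    0.0 means the requirement was met or exceeded.
    """
    if req.required <= 0:
        raise ValueError("required must be positive")
    if req.higher_is_better:
        shortfall = req.required - req.measured
    else:
        shortfall = req.measured - req.required
    return max(0.0, shortfall / req.required)


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    requirements = [
        Requirement("database size (GB)", required=2000, measured=1000),
        Requirement("query response (s)", required=30, measured=45,
                    higher_is_better=False),
        Requirement("concurrent users", required=200, measured=250),
    ]
    for req in requirements:
        print(f"{req.name}: gap {requirement_gap(req):.0%}")
```

A per-requirement table like this gives the stakeholders something concrete to review during the pause, when they decide whether to relax requirements, adjust schedules, or seek vendor commitments.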
This entire process usually
works best if all the stakeholders are made aware in advance
of the nature of the questions being confronted. Some IT organizations
and integrators operate by shielding their clients from the feasibility
questions. It is far more effective, and more professional, to
reveal the questions and the process that has been developed
for resolving them. This makes the client part of the team that
will weigh the risks and resolve the trade-offs between the various
approaches to closing the requirements gap.