Intelligent Enterprise
Testing the Terabytes
But guess what? As well as thrilling us, these huge BI systems had better work. You absolutely must test early and on a large scale. The idea of huge BI systems that simply have to work leads, of course, to the Richard Winter theme song: When it comes to database scalability, the bigger it is, the more you have to test it, and you have to do so before you get too far downstream. That is, you can't wait until you've bought all the equipment, built the applications, and created the production database before testing.

But early large-scale testing is the very issue that causes
most people to throw up their hands. After all, how can you possibly
create a 1TB database (or worse, a 10TB one) early in the design
process? It's difficult to assemble the equipment at this
point in the project. It's difficult to free up the resources
to design, implement, and evaluate large-scale tests. It's
also difficult to prevent other project activities from disrupting
a large-scale test process. And for many users, it's difficult
to sort out what to test.

Traditional Testing Options

The top vendors of very large databases (VLDBs) do regularly assist their clients with large-scale proof-of-concept as part of their professional service offerings. We spoke with Compaq Computer Corp., Microsoft, Hewlett-Packard, NCR, Oracle, and Sun Microsystems to get an idea of what they offer.

NCR described a proof-of-concept approach used for a long
time at its benchmark centers and at customer sites. Oracle generally
conducts large proof-of-concept projects at the customer site.
Most vendors described lab facilities or benchmark centers that
are occasionally pressed into service to do a more extensive
customer proof-of-concept.

A New Option

When we have been involved, both NCR and Oracle have done a fine job, and NCR's experience with extreme requirements has shown through. However, IBM has recently gone a step further than any other vendor: it has created a group of test and integration centers intended to help customers who need to complete a large-scale BI proof-of-concept. Called the Teraplex Integration Centers, these are permanent, large-scale testing laboratories with dedicated equipment and personnel, open for customers and business partners to use. There is a Teraplex Center for each of IBM's major BI platforms: RS/6000, OS/390, AS/400, and Netfinity (for Windows NT). They are staffed with dedicated, interdisciplinary teams of IBM personnel who can cover issues in hardware, system software, database software, and so on. Each center has terabytes of disk and large complements of hardware and software in place. In addition, each center can obtain additional hardware, software, and people to support specific projects.

The idea behind these centers is to provide a laboratory where
real-world integration and testing can take place. Thus, customers
ordinarily bring real data. A typical test reproduces all the
critical elements of the customer's environment or
planned environment. Thus, when customers use independent software
vendors' tools, utilities, or applications, they install
them at the center under a temporary license. As a result, there
have been Teraplex projects involving Oracle, Informix, and other
database engines. The point is that BI solutions are ordinarily
created with the products of multiple suppliers, and a critical
element of the challenge is determining the performance and scalability
of the resultant integrated system.

Not Benchmark Centers

A noteworthy point about the Teraplex Centers is that they're entirely separate from IBM's benchmarking centers. IBM uses the benchmarking centers principally for running industry-standard benchmarks (such as TPC) and for competitive measurements that are part of the sales cycle.

The Teraplex Centers, however, are more for client-defined
feasibility studies and proof-of-concept exercises. These studies
and exercises are focused on integration, performance, and scalability
in relation to a specific business problem, where the client
is intent on implementing a solution. The experience is more
realistic, in the sense that there seldom are artificial rules
or artificial deadlines; there is no contest.

Case Study: Aetna

Aetna U.S. Healthcare, which manages 13 million medical policies, came to the RS/6000 Teraplex Center to test the stability, performance, and scalability of its data warehouse with a multiuser workload under a new version of DB2. Aetna expected its warehouse to grow from 200GB to more than 1TB in the first year and, subsequently, to 2 to 3TB. The company needed to ensure that the technology could deliver on a database that large.

The objectives of Aetna's Teraplex project were:

- Test the performance and scalability of DB2 Universal Database in a large, multiuser, data warehouse environment using real data and up to 33 concurrent queries.
- Test the stability of DB2 Universal Database. Aetna would be one of the first customers to put it into production for a data warehouse expected to exceed 1TB.

Aetna used its own data, delivering 100GB to the Teraplex Center, from which IBM created the two larger databases of 500GB and 1TB. The largest table in the 1TB database was over 400GB, with 935 million rows. All sensitive information was encrypted for security and patient confidentiality. Aetna also provided several complex queries, many of which required joins involving between 11 and 17 tables and represented a star or snowflake pattern. The tests ran during a three-month period.

Aetna was able to meet its performance requirements on the
1TB database when running up to 33 concurrent, extremely complex
queries, verifying the viability of its hardware and software
architecture for future business needs. The testing also enabled
Aetna to put its data warehouse into operation a few months sooner
than it would have otherwise.

More is Better

We would like to see more full-scale test and integration centers in the industry. Such centers would not only help customers, which indirectly helps the vendor, but would also aid the vendor directly. The VLDB producer providing such a service gains the opportunity to work hands-on with a new product running against a realistic, large-scale customer workload in an environment in which it can be fully metered, analyzed, and subjected to experimentation.

The experience of participating in one of these projects feels different from that of running a typical benchmark, and it leaves you with a remarkable sense of its distinctive value. The combination of dedicated centers, dedicated staff, professional management, institutionalized executive support from the highest levels of the organization, and a systematic process results in a palpable difference. Also, hardware can be reconfigured much more quickly at the center than is possible at a customer site.

At the same time, the Teraplex idea allows the process to last longer and take on a more exploratory character than what typically happens in a benchmark center. Because there is no pressure to win a contest, there is an opportunity to investigate options more thoroughly, which helps ensure that the solution meets your demands.
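
As a rough illustration of the kind of multiuser test described in the Aetna case study, the sketch below ramps a workload up to 33 concurrent queries and records wall-clock time and per-query latency at each level. The article does not describe the harness that IBM or Aetna actually used; `run_workload`, `fake_query`, and the ramp levels are hypothetical names and values chosen for illustration, and the stand-in query only sleeps rather than touching a database.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_workload(queries, run_query, concurrency):
    """Run every query once using `concurrency` worker threads.

    Returns (wall_clock_seconds, list_of_per_query_seconds).
    """
    def timed(query):
        start = time.perf_counter()
        run_query(query)
        return time.perf_counter() - start

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # pool.map preserves input order, so latencies[i] belongs to queries[i].
        latencies = list(pool.map(timed, queries))
    return time.perf_counter() - wall_start, latencies


def fake_query(query):
    # Hypothetical stand-in for a real database call; it only sleeps.
    time.sleep(0.01)


if __name__ == "__main__":
    # 33 matches the peak concurrency mentioned in the case study.
    queries = [f"query_{i}" for i in range(33)]
    for level in (1, 11, 33):  # ramp levels are arbitrary assumptions
        wall, latencies = run_workload(queries, fake_query, level)
        print(f"{level:>2} concurrent: wall={wall:.2f}s "
              f"slowest={max(latencies):.3f}s")
```

In a real test, `fake_query` would be replaced by a function that submits a query through the database's own client library, and the interesting result is how the slowest-query latency changes as the concurrency level rises.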