SQL Darwinism: On SQL 
				Server Baselines, Metrics Collection and Trend Analysis 
				By Bill Wunder
				Ernst Mayr will be 100 this year. Professor of 
				Zoology Emeritus at Harvard University, Mayr is sometimes 
				referred to as the Dean of Evolution and has long been a 
				recognized expert in evolutionary biological science. Mayr 
				completed an interesting dissection of Darwin's theories from 
				The Origin of Species  in the 1991 book, One Long 
				Argument. In this book the Professor masterfully describes 
				components of Darwin's evolution. With all due respect to 
				Professor Mayr's scientific knowledge, I'm no biologist so I 
				can't really speak to the nuances of his scientific divergence 
				with Darwin - though in my understanding the differences are 
				significant to the highly trained mind. I do find it 
				particularly insightful to borrow the good Professor's analysis 
				of Darwin's basic premise as a way to understand and explain the 
				change inherent in database applications. With my apologies for 
				any misrepresentations or obfuscations I have taken the liberty 
				of doing a little word substitution into Mayr's itemized account 
				of Darwin's evolutionary theory as it might describe a typical 
				database application on this, the threshold of Mayr's second 
				century. I have also added a few software technology keywords to 
				try to help you conjure up an image of how the aspects of Mayr's 
				critique of Darwinism offers a plausible analytical value in the 
				world of database applications: 
				
					- Evolution as 
					such. This is the theory that the data application is 
					not constant or recently created nor perpetually cycling, 
					but rather is steadily changing, and that data and structure 
					are transformed in time. keywords: software lifecycle, 
					iterative development, service packs, patches, bugs, scope 
					creep 
 
					- Common descent. 
					This is the theory that every set descended from a common 
					object, and that all groups of objects, including methods, 
					constants, and even relational tables, ultimately go back to 
					a single binary origin. keywords: the SQL Server 2000 model 
					database, data modeling, reusability, OOP, the wheel 
					(obscure), hierarchical, relational, star schema 
					
 
					- Multiplication 
					of species. This theory explains the origin of the 
					enormous data diversity. It postulates that data multiply, 
					either by splitting into daughter sets or by "budding", that 
					is, by the establishment of application specific isolated 
					founder populations that evolve into new sets. keywords: 
					project management, normalization, refactoring, 
					backup/restore, CREATE DATABASE, EDI, XML 
					
 
					- Gradualism. 
					According to this theory, evolutionary change takes place 
					through the gradual change of set populations and not by the 
					sudden (saltational) production of new individuals that 
					represent a new type. keywords: insert, update, delete, 
					legacy components, backward compatibility, change control, 
					versioning 
 
					- Natural selection. 
					According to this theory, evolutionary change comes about 
					through the abundant production of attribute variation in 
					every generation. The relatively few individuals who 
					survive, owing to a particularly well-adapted combination of 
					inheritable characters, give rise to the next generation. 
					(keywords  delete/archive strategy, outsourcing, Internet 
					bust, dBase, MVS, Microsoft, Oracle, MySQL)
 
				 
				
				Hopefully you found the ease with which 
				biological evolution and database technology can be similarly 
				described palatable and at least a little amusing. At another 
				level I believe it is also similarly interesting to explore the 
				advantages Mayr has at his disposal in critiquing Darwin's 
				postulations. From the time Sir Charles sailed around South 
				America to the time Ernst Mayr spent his scholarly career 
				exploring the validity of Darwin's pioneering work a tremendous 
				amount of data had been accumulated. Observations and essays to 
				compare and contrast to the original ideas are now plentiful and 
				represent a huge spectrum of empirical thought. Sure, Darwin had 
				some guidance and others before him had offered bits and pieces 
				of information that he could build upon, but Mayr had much more. 
				He had a clearly defined starting point for the definition of 
				biological evolution and he had a rather extended trend of 
				scientific thought upon which to build a clearer, more accurate 
				description of biological evolution. Simply put, Mayr had a 
				baseline, a collection of data points from which to clearly 
				state how well the original theory performed in the moment and 
				even to predict how the theory might change in the future, and 
				most importantly he followed a sound methodology to reach sound 
				conclusions.  
				Likewise, in order to gauge the performance of 
				original specifications for a data application - and we all know 
				how theoretical those original specifications can be - one needs 
				first to establish baselines. In too many cases the baseline is 
				left anecdotal with the result that proactive and predictive 
				analysis is pre-empted by crisis and the path forward is left 
				open to less than well thought out adjustments to the original 
				theory. Just consider how far down a narrow path Darwinism 
				progressed before it received the necessary analysis and 
				adjustments to extract the facts from the fictions - to wit 
				Herbert Spencer's 
				
				Social Darwinism 
				hogwash. And not until late in the 20th century were adequate 
				corrections advanced for those cogent errors of the late 19th 
				century.  So it is with database components of an application 
				that a baseline coupled with continual analysis and adjustment 
				is key to assuring continual and accurate suitability of the 
				hardware and software components to the application. 
				To say that in a different way, there are 
				really two general approaches used to determine when to upgrade 
				or re-architect a SQL Server. I'm not saying that there are only 
				two ways to upgrade. Rather the point I want to make - and 
				expound upon at some length in my next article by the way - is 
				that what causes you to decide to upgrade or fundamentally alter 
				an application database server is either a measured response in 
				a systems evolution or a reaction to a crisis or assumption at a 
				critical point. A SQL Server critical point occurs when hardware 
				or design limitations result in unsatisfactory performance or 
				unacceptable availability. Perhaps the total CPU hovers between 
				80 and 100% independent of a single run-away process or disk 
				queuing on your storage system results in general customer 
				dissatisfaction in application response time or you run out of 
				tricks to prevent your log device from filling with a resultant 
				interruption of production processing on a regular basis, or you 
				flat run out of disk space for your data set or processor 
				bandwidth for the application load. Conversely, the evolution of 
				a database application is generally gradual and to a large 
				extent predictable. It is common to measure web hits trend up or 
				down or the count of new customers to increase by n number a 
				month. The challenge - from the database capacity and 
				performance planning perspective - is how to equate such high 
				level application growth rates with viable database server load 
				expectations. When well done, a SQL Server critical point can be 
				avoided while at the same time the factual data can be leveraged 
				to result in an efficient and effective upgrade path rather than 
				a potential for SQL Darwinism. 
				Allow me to facetiously illustrate where such 
				SQL Darwinism might take an application. First, in keeping to an 
				example consistent with Spencer's imperative that it is our 
				moral duty to allow the weak to fail undeterred, lets say that 
				SQL Darwinism holds that an application be slated for a new 
				platform "when it outgrows" SQL Server (bet you never heard that 
				one before).  Then it would follow that only ANSI SQL be used in 
				that application in spite of the well known truth that ANSI SQL 
				does not assure an easy upgrade path to or from any existing 
				database product. And it could even be that a follower of this 
				SQL Darwinism would be resistant to remedial changes when a 
				growing application demonstrates barely adequate performance on 
				an earlier and/or aging version of a database product, OS, or 
				hardware platform - since implicit in this Darwinism is the 
				expectation that the application will outgrow the platform 
				regardless. So finally the day comes when a poor indexing scheme 
				causes the SQL Server to no longer keep up with the application 
				needs. Fortunately, and predictably in the grandest design of 
				SQL Darwinism, there is almost always an unscrupulous entity - 
				lets call ours "N Ron Consultants" - waiting in the wings to 
				move the struggling application to a new platform at some hefty 
				fee with high associated costs and, if the Darwinist is at all 
				lucky, quietly slip in the necessary indexing corrections as 
				they rewrite the ANSI SQL for the new platform.  
				Maybe that little story is a bit far fetched, 
				maybe not. What is most definitely not far fetched is that it's 
				not that difficult to establish baselines and watch the trends 
				on a database server as part of a methodical approach to keeping 
				a successful application successful. In my next article I'm 
				going to explain how you can make use of the SQL Server 2000 
				sysperfinfo system table and the global statistics available 
				within SQL Server to easily and automatically establish 
				baselines on your servers and to continuously and automatically 
				track changes from those baselines to help you avoid the 
				pitfalls of SQL Darwinism - even if your application is 
				successful.    
				Bill 
				  
				 |