Prani: The Importance of Histograms

On one occasion, I experienced a query that was running against a large table whose indexed columns did not have normally distributed data. A particular query on this table was taking over 18 hours to parse, execute, and fetch its rows. After we realized that the data, instead of being normally distributed between a low and high value, had spikes of high and low values throughout, we analyzed the table using the following options:

SQL> ANALYZE TABLE ABC COMPUTE STATISTICS FOR COLUMNS award
SIZE 100;

This command created a histogram that divided the data in the AWARD column of the ABC table into 100 separate slices. The histogram then determined the high and low value for each of the slices and stored those as part of the table and index statistics. These statistics gave the optimizer a much more accurate view of the data and the usefulness of the associated
indexes.

As a result, the same query that had taken 18 hours to complete now ran in only three minutes with no other changes. This clearly demonstrates the importance of using histograms when the column data being accessed is not normally distributed. Histogram information can be found in the DBA_HISTOGRAMS data dictionary view.

Prani

Sunday, May 11, 2008

The Importance of Histograms

No comments: