A heuristic query optimizer must choose the best way to process an incoming query. It makes this choice by comparing the expected costs of many (or all) of the ways in which the query might be processed. These expected costs are computed from statistics on the sizes of the relations involved and the selectivities of the operations being performed. Such estimates are subject to error, and in this paper we investigate how sensitive the best query plan is to errors in the selectivity estimates. We treat the common case of join queries and show that the optimal plan for most queries is very insensitive to selectivity inaccuracies. Hence, there is little reason for a data manager to spend much effort on making accurate estimates of join selectivities.
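The kind of sensitivity question posed here can be illustrated with a toy experiment. The sketch below is not the paper's cost model or experimental setup. It assumes three hypothetical relations `R`, `S`, and `T` with made-up cardinalities and selectivities, considers only left-deep join orders, and takes a plan's cost to be the sum of its estimated intermediate result sizes. It then asks how much worse the plan chosen under a mis-estimated `R`-`S` selectivity is than the true optimum.

```python
from itertools import permutations

# Hypothetical statistics for illustration only; none of these come from the paper.
SIZES = {"R": 1_000, "S": 10_000, "T": 100}
TRUE_SEL = {
    frozenset(("R", "S")): 0.001,
    frozenset(("S", "T")): 0.001,
}  # pairs without an entry have no join predicate (selectivity 1.0)


def plan_cost(order, sel):
    """Assumed cost model: sum of intermediate result sizes of a left-deep plan."""
    joined = [order[0]]
    rows = SIZES[order[0]]
    cost = 0.0
    for rel in order[1:]:
        rows *= SIZES[rel]
        for prev in joined:
            rows *= sel.get(frozenset((prev, rel)), 1.0)
        joined.append(rel)
        cost += rows
    return cost


def best_plan(sel):
    """Exhaustively pick the cheapest left-deep join order under `sel`."""
    return min(permutations(SIZES), key=lambda order: plan_cost(order, sel))


def with_error(factor):
    """Return estimated selectivities with the R-S value off by `factor`."""
    est = dict(TRUE_SEL)
    est[frozenset(("R", "S"))] *= factor
    return est


def regret(est_sel):
    """True cost of the plan chosen from estimates, relative to the true optimum."""
    chosen = best_plan(est_sel)
    optimal = best_plan(TRUE_SEL)
    return plan_cost(chosen, TRUE_SEL) / plan_cost(optimal, TRUE_SEL)


if __name__ == "__main__":
    for factor in (0.01, 0.2, 1.0, 5.0):
        print(f"error x{factor}: plan={best_plan(with_error(factor))}, "
              f"regret={regret(with_error(factor)):.2f}")
```

A regret of 1.0 means the estimation error did not change which plan was chosen. In this toy setup, moderate errors leave the choice unchanged and only a large underestimate switches it, which is the flavor of insensitivity the abstract describes, though the numbers here say nothing about the paper's actual results.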