A question about selectivity

2009-1-4 14:52:28

On the calculation of selectivity, I came across the following description:
We have been using the basic formula for the selectivity of a range-based predicate, and know that this varies slightly depending on whether you have zero, one, or two closed (meaning the equality in <=, >=) ends to your range. Using N as the number of closed ends, the version of the formula that uses user_tab_columns.num_distinct can be written as
(required range) / (column high value - column low value) + N / num_distinct
For example, in a column with 1,000 different (integer) values ranging from 1 to 1000, and the pair of predicates colX > 10 and colX =1 ... In that case there are two closed ends, so N is taken as 2.
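To make the formula easier to experiment with, here is a minimal Python sketch of it. The function name, the argument names, and the bounds 10 and 20 are my own illustrative assumptions; this only evaluates the quoted formula and is not a claim about how Oracle implements it internally.

```python
def range_selectivity(lo, hi, col_low, col_high, num_distinct, closed_ends):
    """Evaluate the quoted range-predicate formula (no histogram assumed)."""
    if closed_ends not in (0, 1, 2):
        raise ValueError("closed_ends must be 0, 1, or 2")
    # Fraction of the column's value range covered by the predicate.
    range_part = (hi - lo) / (col_high - col_low)
    # Each closed end adds N / num_distinct, i.e. 1 / num_distinct per end.
    return range_part + closed_ends / num_distinct


# Hypothetical bounds 10 and 20 on a column holding the integers 1..1000,
# evaluated for zero, one, and two closed ends.
for n in (0, 1, 2):
    print(n, range_selectivity(10, 20, 1, 1000, 1000, n))
```

Each additional closed end raises the result by exactly 1 / num_distinct, which is the only part of the formula that uses num_distinct.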
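After (required range) / (column high value - column low value), the formula adds N / num_distinct. Clearly, when the optimizer has no histogram statistics, it is assuming that the data is uniformly distributed. But if it assumes a uniform distribution, why does the first term not use required range / num_distinct?
Is there something wrong with my reasoning?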

千问 · 2009-1-4 14:52:28

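Sometimes Oracle cannot pick the right execution plan on its own, and we have to add a hint by hand. In a real production environment, though, changing the code may not be allowed, so we are left with modifying the statistics manually.
The point of this discussion is to understand the optimizer's internal mechanics. Once this question is clear, modifying the statistics becomes much easier to get right. I hope everyone will join in.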

千问 · 2009-1-4 14:52:28

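The question in #4 can be understood by assuming a data distribution such as 0.5, 0.75, 0.75, 0.8, 0.8, 1, 2, 3, 3.
In that case, high - low << num_distinct.
This is only the outline of the idea; anyone interested can work through the proof themselves.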
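As a rough numerical check of this idea, the Python sketch below computes both candidate first terms for the sample data above. The predicate covering the whole column (col >= 0.5 and col <= 3) is my own assumption, chosen because the first term should obviously come out as 1 for it.

```python
# Sample distribution from the post.
values = [0.5, 0.75, 0.75, 0.8, 0.8, 1, 2, 3, 3]

col_low, col_high = min(values), max(values)
num_distinct = len(set(values))

# Hypothetical predicate spanning the whole column: col >= 0.5 AND col <= 3.
required_range = col_high - col_low

# First term as written in the quoted formula.
by_value_range = required_range / (col_high - col_low)
# First term as proposed in the question.
by_num_distinct = required_range / num_distinct

print("high - low   =", col_high - col_low)
print("num_distinct =", num_distinct)
print("required range / (high - low)  =", by_value_range)
print("required range / num_distinct =", by_num_distinct)
```

The required range is measured in the units of the column values, while num_distinct is a count of values, so their ratio is only a sensible fraction when the values happen to be consecutive integers, as in the 1 to 1000 example, where high - low is nearly equal to num_distinct.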