Last_value问题(学习分析函数之first_value和last_value)

显示全部楼层 · 2013-2-25 14:51:24

本帖最后由瘾上你了于 2012-2-26 17:07 编辑
一：代码如下：
建表：
create table test(
id number,
mc number,
sl number
)
插数：
insert into test values(2,110,1);
insert into test values(1,111,1);
insert into test values(2,555,2);
insert into test values(1,222,1);
insert into test values(2,333,2);
insert into test values(1,666,3);
insert into test values(1,333,2);
insert into test values(2,222,1);
insert into test values(1,555,3);
commit;
查表：
select * from test；
idmc sl
2 110 1
1 111 1
2 555 2
1 222 1
2 333 2
1 666 3
1 333 2
2 222 1
1 555 3
二：问题来了：
关于以下SQL语句：
select id,mc,sl,
first_value(mc) over (partition by id order by sl) fv,
last_value(mc) over (partition by id order by sl) ls
from test；
得到的结果是：
idmc slfvls
1 111 1 111 222
1 222 1 111 222
1 333 2 111 333
1 555 3 111 666
1 666 3 111 666
2 110 1 110 222
2 222 1 110 222
2 333 2 110 555
2 555 2 110 555
为什么得到的ls列是上面这样的，为什么不是下面这样的？
idmc slfvls
1 111 1 111 111
1 222 1 111 222
1 333 2 111 333
1 555 3 111 555
1 666 3 111 666
2 110 1 110 110
2 222 1 110 222
2 333 2 110 333
2 555 2 110 555
我知道last_value默认的windowing方式是current row 到 unbounded preceeding. 问题就在于现在同一个分区里面order by的列名的值一样的，为什么last_value是取的相同order by 的列名值的最后一个值呢？比如order by sl 的前面两行sl 的值都是1，为什么第一行的last_value函数的值不取current row（第一行的当前行就是第一行）的mc的值而取第二行的呢？第2行的值取第2行这是理所当然。请大家帮我解答下，非常感谢。
二：下面是关于first_value 和 last_value 分析函数的总结，不是问题，算是我的个人笔记吧。
摘自：askTom
first_Value(column_name) over (partition by table_name order by column_id) a,
that will take the data, break it up by table_name
then, within each TABLE_NAME, sort by column_id
then, return the FIRST_VALUE of column name in the window which is by default (since you didn't say otherwise) going from the CURRENT ROW to UNBOUNDED PRECEDING.
So, the first_value is the first row in the window - always.

last_value(column_name) over (partition by table_name order by column_id desc) b from all_tab_columns

that will
a) break up by table_name
b) sort by column_id
c) return last row from current window - which by default is the CURRENT ROW and UNBOUNDED PRECEDING....
The last row of the window is always changing - as you progress through the partition.

you either want:
a) the first_value after sorting by column_id - wait you already have that.
b) the last_value with rows between current row and unbounded following.
So like this:
1 select table_name,
2column_id,
3column_name,
4first_Value(column_name) over (partition by table_name order by column_id) a,
5last_value(column_name) over (partition by table_name order by column_id desc
rows between current row and unbounded following) b
6 from all_tab_columns
7where table_name='ALL_ALL_TABLES'
In this way,both the value of column a and b are the same.

千问 · 2013-2-25 14:51:24

不光这个函数，只要带order by的不带window的都是默认是range unbounded preceding and current row，也就是逻辑窗口。逻辑窗口可以保证结果的唯一性，current row不是物理当前行，是逻辑当前行，所以排序值有重复的，把所有的重复行都包含进去了

千问 · 2013-2-25 14:51:24

好问题，跟着加深一下over窗口的理解

千问 · 2013-2-25 14:51:24

为了排除重复，常用的技巧是在ORDER BY最后带上ROWID或者主键。

千问 · 2013-2-25 14:51:24

newkid 发表于 2012-2-27 06:11
为了排除重复，常用的技巧是在ORDER BY最后带上ROWID或者主键。
谢谢斑竹。
能给我个 order by + rowid 的例子吗？

千问 · 2013-2-25 14:51:24

本帖最后由 bell6248 于 2012-2-27 11:27 编辑
瘾上你了发表于 2012-2-27 11:19
谢谢斑竹。
能给我个 order by + rowid 的例子吗？
LZ, 分析函数默认的是按逻辑划分的，具体看看相关Oracle文档即可！
你要的结果只要加点很小的改动即可，我提供了2种方法及ls_right_method1和ls_right_method2列，具体测试如下：
SQL> select * from test;
ID MC SL
---------- ---------- ----------
2110
1
1111
1
2555
2
1222
1
2333
2
1666
3
1333
2
2222
1
1555
3
9 rows selected
SQL>
SQL> select id,
2 mc,
3 sl,
4 first_value(mc) over(partition by id order by sl) fv,
5 last_value(mc) over(partition by id order by sl, rowid) ls_right_method1,
6 last_value(mc) over(partition by id order by sl rows between unbounded preceding and current row) ls_right_method2
7from test;
ID MC SL FV LS_RIGHT_METHOD1 LS_RIGHT_METHOD2
---------- ---------- ---------- ---------- ---------------- ----------------
1111
1111
111
111
1222
1111
222
222
1333
2111
333
333
1555
3111
555
555
1666
3111
666
666
2110
1110
110
110
2222
1110
222
222
2333
2110
333
333
2555
2110
555
555
9 rows selected
SQL>

千问 · 2013-2-25 14:51:24

本帖最后由瘾上你了于 2012-2-27 12:50 编辑
谢谢几位版主，我现在是一边看Oracle的官方文档一边对着精华贴学的。根据你们的回答，我总结下：
1.Oracle 分析函数默认窗口是：range between unbounded preceding and current row，这也就是逻辑窗口。unbounded preceding总是当前窗口中的第1行，current row也是逻辑的当前行，会包含重复行。
2.last_value 排除重复行的方法：
a.order by primary key 或者 unique key
b. order by 后面加上 rowid. 如：
last_value(mc) over(partition by id order by sl, rowid) ls_right_method1

c.使用 rows between unbounded preceding and current row. 这里的current row 就是物理上的当前行。如：
last_value(mc) over(partition by id order by sl rows between unbounded preceding and current row) ls_right_method2

千问 · 2013-2-25 14:51:24

看了帖子之
first_value(mc) over(partition by id order by sl) 相当于 id和sl分组
SQL> select
2id,
3mc,
4sl,
5first_value(mc) over(partition by id order by sl) as fv,
6last_value(mc) over(partition by id order by sl rows between unbounded preceding and unboun
ded following) as lv
7
from tfl;
ID MC SL FV LV
---------- ---------- ---------- ---------- ----------
1111
1111666
1222
1111666
1333
2111666
1555
3111666
1666
3111666
2110
1110555
2222
1110555
2333
2110555
2555
2110555
已选择9行。
这样才是根据id分组的第一行和最后一行。
学习了