ORA-01555错误处理一例

[复制链接]
查看11 | 回复5 | 2014-5-16 14:12:02 | 显示全部楼层 |阅读模式
近来生产系统反馈,时常有作业被异常中止,导致应用程序被hung住。经过检查日志,是某个作业在运行时发生了1555错误,导致程序无法返回结果: * ERROR at line 1:
ORA-01555: snapshot too old: rollback segment number 9 with name "RBS08" too small
ORA-06512: at "SQLUSER.EXT_PKG", line 4917
ORA-06512: at line 1

相关程序记录下的日志:STAT-S.20060313185536.lg = 2 processed =20060313185536 end date Error = ORA-01555: snapshot too old: rollback segment number 9 with name "RBS08" too small BEGIN ext_pkg.main('extfiles','50','EAT'); END;
错误分析解决这是一个典型的1555错误。检查引发该错误的PACKAGE,发现它只有一个入口函数main(及程序日志中记录的函数),但这个函数还调用了其他N个PACKAGE里面的函数。这是一个大作业,执行时,设置它使用了一个大的回滚段:RBS_BATCH1。先看看相关配置:rollback tablespace空间为8G,undo_retention为1800。看看回滚段的统计数据:SQL> Select rn.Name "Rollback Segment", rs.RSSize/1024 "Size (KB)", rs.Gets "Gets",2 rs.waits "Waits", (rs.Waits/rs.Gets)*100 "% Waits",3 rs.Shrinks "# Shrinks", rs.Extends "# Extends"4from sys.v_$RollName rn, sys.v_$RollStat rs5wherern.usn = rs.usn; Rollback Segment
Size (KB) GetsWaits% Waits# Shrinks# Extends




------------------------------ ---------- ---------- ---------- ---------- ---------- ----------




SYSTEM
952 1189
0
0
0
0




RBS_BATCH1
409592 681422667 .097883544
0
0




... ...S_BATCH1 getshould be provided by DEV Team.r.and CSS_EMAN_INDX exist.eir RBS10
204792 478502 10 .002089855
0
0




RBS11
204792 477366
8 .001675863
0
0




RBS12
204792 491070
6 .001221822
0
0




RBS_BATCH2
409592 650088644 .099063511
0
0




RBS_BATCH3
409592 243849
3.00123027
0
0




16 rows selected. 注意到RBS_BATCHT1的wait%是0.098%,这个值应该是比较好的一个值。 回过头再来看依法错误的语句:调用ext_pkg.main函数。在程序日志中已经记录下了输入参数,这就比较好办了:作一个trace,看看到底哪条语句的性能最差:SQL> alter system set events ‘10046 trace name context forever, level 1’; System altered. SQL> exec ext_pkg.main('extfiles','50','EAT'); PL/SQL procedure successfully completed. SQL> alter system set events ‘10046 trace name context off’; System altered.用tkprof处理trace文件后检查trace文件:TKPROF: Release 9.2.0.5.0 - Production on Tue Mar 14 09:21:58 2006 Copyright (c) 1982, 2002, Oracle Corporation.All rights reserved. Trace file: prod_ora_29225.trcSort options: default ********************************************************************************count= number of times OCI procedure was executedcpu= cpu time in seconds executingelapsed= elapsed time in seconds executingdisk = number of physical reads of buffers from diskquery= number of buffers gotten for consistent readcurrent= number of buffers gotten in current mode (usually for update)rows = number of rows processed by the fetch or execute call******************************************************************************** alter session set events='10046 trace name context forever, level 1' ... ... BEGINext_pkg.main('extfiles','50','EAT');END; call count cpuelapsed diskquerycurrentrows------- -------------- ---------- ---------- ---------- --------------------Parse10.01 0.04
4 39
0 0Execute10.20 0.212456037808
0 1Fetch00.00 0.00
0
0
0 0------- -------------- ---------- ---------- ---------- --------------------total20.21 0.262456437847
0 1 Misses in library cache during parse: 1Optimizer goal: CHOOSEParsing user id: 30******************************************************************************** ... ...SELECT QIDFROM CSQUE QUE, ASTP STP WHEREQUE.CDE IN (:B3, :B2) AND QUE.TID =STP.TID AND STP.ACDE = :B1 ORDER BYQUE.CDE, QUE.DTcall count cpuelapsed diskquerycurrentrows------- -------------- ---------- ---------- ---------- --------------------Parse10.00 0.00
0
0
0 0Execute20.00 0.00
0
0
0 0Fetch 26820.41 0.54 755710968
02680------- -------------- ---------- ---------- ---------- --------------------total 26850.41 0.54 755710968
02680 Misses in library cache during parse: 1Optimizer goal: CHOOSEParsing user id: 31 (recursive depth: 1)******************************************************************************** ... ...7userSQL statements in session. 41internal SQL statements in session. 48SQL statements in session.********************************************************************************Trace file: prod_ora_29225.trcTrace file compatibility: 9.02.00Sort options: default1session in tracefile. 7userSQL statements in trace file.41internal SQL statements in trace file.48SQL statements in trace file.20unique SQL statements in trace file.8626lines in trace file.从trace文件中,发现有一条语句性能相当差,通过对这条语句做SQL Trace,发现它的consistent gets达到80万!于是对该语句进行优化,调整了它的写法,并建立了缺少的索引(优化过程略)。最终将consistent gets数量降低到了5000。重新安排上线,经过一周的观察,1555错误没再发生。其实这个案例的解决是比较简单的,最终的处理就是将一条语句进行优化。
回复

使用道具 举报

千问 | 2014-5-16 14:12:02 | 显示全部楼层
好贴啊~~lz能整个pdf吗?
回复

使用道具 举报

千问 | 2014-5-16 14:12:02 | 显示全部楼层
xb0307052 发表于 2012-5-14 14:04
好贴啊~~lz能整个pdf吗?

我整理下吧,发出来一看帖子有点乱,不好意思
回复

使用道具 举报

千问 | 2014-5-16 14:12:02 | 显示全部楼层
gangwendi 发表于 2012-5-14 14:16
我整理下吧,发出来一看帖子有点乱,不好意思


 多谢~~
回复

使用道具 举报

千问 | 2014-5-16 14:12:02 | 显示全部楼层
很详细 不错
回复

使用道具 举报

千问 | 2014-5-16 14:12:02 | 显示全部楼层
好文。
回复

使用道具 举报

您需要登录后才可以回帖 登录 | 立即注册

本版积分规则

主题

0

回帖

4882万

积分

论坛元老

Rank: 8Rank: 8

积分
48824836
热门排行