SCSI and Oracle problems on SUSE Linux

Views: 11 | Replies: 3 | 2014-5-9 08:24:37
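Oracle 10.2 installed on SUSE 9 keeps failing: every few days Oracle ends up treating its data files as read-only. Could anyone help me solve this?
Environment: SUSE 9 Enterprise Server
Oracle 10.2
SCSI
The Oracle instance is installed on a SCSI storage device.
I suspect that Linux's support for the SCSI device is faulty, or that the I/O is too frequent, but I am not sure whether that is right.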

These are the error messages from the SUSE messages log:
May 16 15:54:08 linux kernel: mptbase: ioc0: IOCStatus(0x0043): SCSI Device Not There
May 16 15:54:08 linux kernel: SCSI error :return code = 0x10000
May 16 15:54:08 linux kernel: end_request: I/O error, dev sdb, sector 34415
May 16 15:54:08 linux kernel: Buffer I/O error on device sdb1, logical block 4294
May 16 15:54:08 linux kernel: lost page write due to I/O error on sdb1
May 16 15:54:08 linux kernel: mptbase: ioc0: IOCStatus(0x0043): SCSI Device Not There
May 16 15:54:08 linux kernel: SCSI error :return code = 0x10000
May 16 15:54:08 linux kernel: end_request: I/O error, dev sdb, sector 1480081575
May 16 15:54:08 linux kernel: Buffer I/O error on device sdb1, logical block 185010189
May 16 15:54:08 linux kernel: lost page write due to I/O error on sdb1
May 16 15:54:08 linux kernel: mptbase: ioc0: IOCStatus(0x0043): SCSI Device Not There
May 16 15:54:08 linux kernel: SCSI error :return code = 0x10000
May 16 15:54:08 linux kernel: end_request: I/O error, dev sdb, sector 1481668919
May 16 15:54:08 linux kernel: mptbase: ioc0: IOCStatus(0x0043): SCSI Device Not There
May 16 15:54:08 linux kernel: SCSI error :return code = 0x10000
May 16 15:54:08 linux kernel: end_request: I/O error, dev sdb, sector 1506416111
May 16 15:54:09 linux kernel: mptbase: ioc0: IOCStatus(0x0043): SCSI Device Not There
May 16 15:54:09 linux kernel: SCSI error :return code = 0x10000
May 16 15:54:09 linux kernel: end_request: I/O error, dev sdb, sector 34423
May 16 15:54:09 linux kernel: Buffer I/O error on device sdb1, logical block 4295
May 16 15:54:09 linux kernel: lost page write due to I/O error on sdb1
May 16 15:54:09 linux kernel: mptbase: ioc0: IOCStatus(0x0043): SCSI Device Not There
May 16 15:54:09 linux kernel: SCSI error :return code = 0x10000
May 16 15:54:09 linux kernel: end_request: I/O error, dev sdb, sector 1480081583
May 16 15:54:09 linux kernel: Buffer I/O error on device sdb1, logical block 185010190
May 16 15:54:09 linux kernel: lost page write due to I/O error on sdb1
May 16 15:54:09 linux kernel: mptbase: ioc0: IOCStatus(0x0043): SCSI Device Not There
May 16 15:54:09 linux kernel: SCSI error :return code = 0x10000
May 16 15:54:09 linux kernel: end_request: I/O error, dev sdb, sector 1480095375
May 16 15:54:09 linux kernel: Buffer I/O error on device sdb1, logical block 185011914
May 16 15:54:09 linux kernel: lost page write due to I/O error on sdb1
May 16 15:54:09 linux kernel: mptbase: ioc0: IOCStatus(0x0043): SCSI Device Not There
May 16 15:54:09 linux kernel: SCSI error :return code = 0x10000
May 16 15:54:09 linux kernel: end_request: I/O error, dev sdb, sector 1481668927
May 16 15:54:10 linux kernel: mptbase: ioc0: IOCStatus(0x0043): SCSI Device Not There
May 16 15:54:10 linux kernel: SCSI error :return code = 0x10000
May 16 15:54:10 linux kernel: end_request: I/O error, dev sdb, sector 1506416119
May 16 15:54:10 linux kernel: mptbase: ioc0: IOCStatus(0x0002): Busy
May 16 15:54:10 linux kernel: REISERFS: abort (device sdb1): Journal write error in flush_commit_list
May 16 15:54:10 linux kernel: REISERFS: Aborting journal for filesystem on sdb1
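
The sector numbers reported for sdb and the logical block numbers reported for sdb1 in the log above line up with each other if sdb1 starts at sector 63 of the disk and each block is 4 KiB (eight 512-byte sectors). Neither value is stated in the post; both are assumptions inferred from the numbers themselves. A minimal Python sketch of that arithmetic, useful for confirming that the failing writes all belong to the same partition:

```python
# Assumptions (not stated in the post): sdb1 starts at sector 63 of sdb,
# and the block size is 4 KiB, i.e. 8 sectors of 512 bytes each.
PARTITION_START_SECTOR = 63
SECTORS_PER_BLOCK = 8

def sector_to_logical_block(disk_sector,
                            start=PARTITION_START_SECTOR,
                            per_block=SECTORS_PER_BLOCK):
    """Map an absolute sector on the disk to a logical block in the partition."""
    return (disk_sector - start) // per_block

# (sector on sdb, logical block on sdb1) pairs copied from the log above.
LOG_PAIRS = [
    (34415, 4294),
    (1480081575, 185010189),
    (34423, 4295),
    (1480081583, 185010190),
    (1480095375, 185011914),
]

def check_pairs(pairs=LOG_PAIRS):
    """Return True if every logged pair matches the assumed layout."""
    return all(sector_to_logical_block(s) == b for s, b in pairs)

for sector, block in LOG_PAIRS:
    print(sector, "->", sector_to_logical_block(sector), "(log reports", block, ")")
```

If the partition on a given system starts elsewhere, or the filesystem uses a different block size, the two constants need to be changed accordingly.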

The alert log in Oracle's bdump directory shows the following:
ORA-1653: unable to extend table MYUSER.MYTABLE by 128 in
tablespace MYDATA
ORA-1653: unable to extend table MYUSER.MYTABLE by 8192 in
tablespace MYDATA
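In fact, df -h on Linux shows more than 900 GB of free space on sdb1, and the data files of the tablespace all show normal read/write attributes, yet Oracle refuses to work.
Note that the data files in the tablespace are set to autoextend with no maximum size limit; a single file can grow to about 32 GB.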
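
The figure of about 32 GB per file matches the usual per-datafile ceiling for a smallfile tablespace, which is 2^22 - 1 blocks. The sketch below works that out in Python. The 8 KB block size is an assumption (the post does not give db_block_size), and the function name is only illustrative.

```python
# Assumptions (not stated in the post): a smallfile tablespace and an
# 8 KB database block size. A smallfile datafile holds at most 2**22 - 1 blocks.
MAX_BLOCKS_PER_DATAFILE = 2**22 - 1
ASSUMED_BLOCK_SIZE = 8192  # bytes

def max_datafile_bytes(block_size=ASSUMED_BLOCK_SIZE):
    """Largest size a single smallfile datafile can reach, in bytes."""
    return MAX_BLOCKS_PER_DATAFILE * block_size

size = max_datafile_bytes()
print(size, "bytes, about", round(size / 2**30, 2), "GiB")
```

With a different block size the ceiling scales proportionally, so the value should be checked against the actual database settings.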
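Although a SCSI error is reported, I can still enter the directory where the SCSI device is mounted and view the files; I just cannot write. Only after remounting the SCSI device does it work normally again, but two or three days later the problem comes back.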
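
Files that can be read but not written are consistent with a filesystem that the kernel has switched to read-only after the journal abort, although the post does not confirm this. One way to check is to look at the mount options in /proc/mounts. The following Python sketch parses text in that format; the sample line and the /oradata mount point are hypothetical and only serve as an illustration.

```python
def mount_options(mounts_text, device):
    """Return the option list for `device` from /proc/mounts-style text, or None."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[0] == device:
            return fields[3].split(",")
    return None

def is_read_only(mounts_text, device):
    """True if `device` is listed and its options include 'ro'."""
    opts = mount_options(mounts_text, device)
    return opts is not None and "ro" in opts

# Hypothetical sample; on a live system pass the contents of /proc/mounts instead.
SAMPLE = (
    "/dev/sda2 / reiserfs rw 0 0\n"
    "/dev/sdb1 /oradata reiserfs ro 0 0\n"
)

print(is_read_only(SAMPLE, "/dev/sdb1"))
```

On a real host the same functions can be fed `open("/proc/mounts").read()`, with the device name taken from the kernel log.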

千问 | 2014-5-9 08:24:37
It still looks like a SCSI problem to me.
Which kernel version are you running?

千问 | 2014-5-9 08:24:37
linux 2.6
My SUSE Enterprise Server 9 already has SP3 installed.
I have just upgraded the kernel to 252.

千问 | 2014-5-9 08:24:37
I am now hitting the same problem,
rhel5.5 for oracle,
both disks mapped from the storage array report kernel: Buffer I/O error on device sdb1. How did you solve it back then?
The errors are as follows:
Sep 24 14:50:59 db10 kernel: sd 3:0:0:0: timing out command, waited 360s
Sep 24 14:50:59 db10 kernel: sd 3:0:0:0: Unhandled error code
Sep 24 14:50:59 db10 kernel: sd 3:0:0:0: SCSI error: return code = 0x060d0000
Sep 24 14:50:59 db10 kernel: Result: hostbyte=DID_REQUEUE driverbyte=DRIVER_TIMEOUT,SUGGEST_OK
Sep 24 14:51:59 db10 kernel: sd 3:0:0:0: timing out command, waited 360s
Sep 24 14:51:59 db10 kernel: sd 3:0:0:0: Unhandled error code
Sep 24 14:51:59 db10 kernel: sd 3:0:0:0: SCSI error: return code = 0x060d0000
Sep 24 14:51:59 db10 kernel: Result: hostbyte=DID_REQUEUE driverbyte=DRIVER_TIMEOUT,SUGGEST_OK
Sep 24 14:51:59 db10 kernel: printk: 118 messages suppressed.
Sep 24 14:51:59 db10 kernel: Buffer I/O error on device sdb1, logical block 82187591
Sep 24 14:51:59 db10 kernel: lost page write due to I/O error on sdb1
Sep 24 14:51:59 db10 kernel: Buffer I/O error on device sdb1, logical block 82187592
Sep 24 14:51:59 db10 kernel: lost page write due to I/O error on sdb1
Sep 24 14:51:59 db10 kernel: Buffer I/O error on device sdb1, logical block 82187593
Sep 24 14:51:59 db10 kernel: lost page write due to I/O error on sdb1
Sep 24 14:51:59 db10 kernel: Buffer I/O error on device sdb1, logical block 82187594
Sep 24 14:51:59 db10 kernel: lost page write due to I/O error on sdb1
Sep 24 14:51:59 db10 kernel: Buffer I/O error on device sdb1, logical block 82187595
Sep 24 14:51:59 db10 kernel: lost page write due to I/O error on sdb1
Sep 24 14:51:59 db10 kernel: Buffer I/O error on device sdb1, logical block 82187596
Sep 24 14:51:59 db10 kernel: lost page write due to I/O error on sdb1
Sep 24 14:51:59 db10 kernel: Buffer I/O error on device sdb1, logical block 82187597
Sep 24 14:51:59 db10 kernel: lost page write due to I/O error on sdb1
Sep 24 14:51:59 db10 kernel: Buffer I/O error on device sdb1, logical block 82187598
Sep 24 14:51:59 db10 kernel: lost page write due to I/O error on sdb1
Sep 24 14:51:59 db10 kernel: Buffer I/O error on device sdb1, logical block 82187599
Sep 24 14:51:59 db10 kernel: lost page write due to I/O error on sdb1
Sep 24 14:51:59 db10 kernel: Buffer I/O error on device sdb1, logical block 82187600
Sep 24 14:51:59 db10 kernel: lost page write due to I/O error on sdb1
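
The return codes in both logs in this thread (0x10000 and 0x060d0000) pack four bytes: status, message, host and driver, from low to high. The Python sketch below splits a code into those fields. The name tables follow the 2.6-era kernel header include/scsi/scsi.h as far as I know and should be treated as an assumption to verify against the kernel source in use; the function name is illustrative.

```python
# Host byte names (bits 16-23), assumed from 2.6-era include/scsi/scsi.h.
HOST_BYTE = {
    0x00: "DID_OK", 0x01: "DID_NO_CONNECT", 0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT", 0x04: "DID_BAD_TARGET", 0x05: "DID_ABORT",
    0x06: "DID_PARITY", 0x07: "DID_ERROR", 0x08: "DID_RESET",
    0x09: "DID_BAD_INTR", 0x0A: "DID_PASSTHROUGH", 0x0B: "DID_SOFT_ERROR",
    0x0C: "DID_IMM_RETRY", 0x0D: "DID_REQUEUE",
}

# Driver byte names (low nibble of bits 24-31), same assumed source.
DRIVER_BYTE = {
    0x00: "DRIVER_OK", 0x01: "DRIVER_BUSY", 0x02: "DRIVER_SOFT",
    0x03: "DRIVER_MEDIA", 0x04: "DRIVER_ERROR", 0x05: "DRIVER_INVALID",
    0x06: "DRIVER_TIMEOUT", 0x07: "DRIVER_HARD", 0x08: "DRIVER_SENSE",
}

def decode_scsi_result(code):
    """Split a 32-bit SCSI result code into its four byte fields."""
    host = (code >> 16) & 0xFF
    driver = (code >> 24) & 0x0F  # the high nibble carries the SUGGEST_* hint
    return {
        "status": code & 0xFF,
        "msg": (code >> 8) & 0xFF,
        "host": HOST_BYTE.get(host, hex(host)),
        "driver": DRIVER_BYTE.get(driver, hex(driver)),
    }

for code in (0x10000, 0x060D0000):
    print(hex(code), decode_scsi_result(code))
```

Decoded this way, the second code agrees with the hostbyte and driverbyte names the kernel printed in the log above, and the first code carries a host byte of 0x01, which fits the "SCSI Device Not There" message from mptbase.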