
Case study: intermittent listener outages on the first node of an Oracle RAC cluster

Posted by: paulyi | Posted: 2014-2-23 18:24:21

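1 Overview
Summary: On the morning of 2011-5-31, the listener resource LISTENER_P730A on node p730a went offline, and the application failed over to node p730b. After that, the listener on p730a dropped out intermittently, roughly every four to five days.
Operating system: AIX 6100
Database: Oracle 10.2.0.5 RAC
Storage: EMC CX4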

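2 Problem description
On the morning of 2011-5-31, the listener resource LISTENER_P730A on node p730a went offline, and the application failed over to node p730b. From then on, the p730a listener went offline roughly every four to five days and had to be started manually each time.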

3 Troubleshooting
1. The following commands work around the problem temporarily
srvctl stop listener -n p730a
srvctl start listener -n p730a
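
The manual restart above could be wrapped in a small check so that the listener is only bounced when it is actually down. This is a minimal sketch, not a tested production script. It assumes that `srvctl status nodeapps -n <node>` prints a line such as "Listener is running on node: p730a" when the listener is up; the function names are hypothetical.

```shell
#!/bin/sh
# Sketch: restart the listener on a node only when nodeapps status
# does not report it as running.
# ASSUMPTION: `srvctl status nodeapps -n <node>` prints a line that
# starts with "Listener is running on node" when the listener is up.

# Reads nodeapps status text on stdin; exits 0 if the listener is
# reported as running, non-zero otherwise.
listener_is_running() {
    grep -q '^Listener is running on node'
}

# Hypothetical wrapper around the stop/start commands shown above.
# If srvctl itself fails and prints nothing, the listener is treated
# as down and a restart is attempted.
restart_listener_if_down() {
    node="$1"
    if srvctl status nodeapps -n "$node" | listener_is_running; then
        echo "listener on $node is running; nothing to do"
    else
        echo "listener on $node is not running; restarting"
        srvctl stop listener -n "$node"
        srvctl start listener -n "$node"
    fi
}
```

A call such as `restart_listener_if_down p730a` could be run by hand or from cron, but it only hides the symptom; the VIP failure behind it still has to be diagnosed.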

2. Check the operating system logs
The most recent entries are dated 2011-5-18; the operating system logged no related errors after that date.

3. Review the CRS log on node p730a
2011-05-30 09:19:37.294: [ CRSAPP][11834]32CheckResource error for ora.p730b.vip error code = 1
2011-05-30 09:19:37.308: [ CRSRES][11834]32In stateChanged, ora.p730b.vip target is ONLINE
2011-05-30 09:19:37.309: [ CRSRES][11834]32ora.p730b.vip on p730a went OFFLINE unexpectedly
2011-05-30 09:19:37.309: [ CRSRES][11834]32StopResource: setting CLI values
2011-05-30 09:19:37.321: [ CRSRES][11834]32Attempting to stop `ora.p730b.vip` on member `p730a`
2011-05-30 09:19:37.689: [ CRSRES][11834]32Stop of `ora.p730b.vip` on member `p730a` succeeded.
2011-05-30 09:19:37.690: [ CRSRES][11834]32ora.p730b.vip RESTART_COUNT=0 RESTART_ATTEMPTS=0
2011-05-30 09:19:37.692: [ CRSRES][11834]32ora.p730b.vip failed on p730a relocating.
2011-05-30 09:19:37.755: [ CRSRES][11834]32Attempting to start `ora.p730b.vip` on member `p730b`
2011-05-30 09:19:44.705: [ CRSRES][11834]32Start of `ora.p730b.vip` on member `p730b` failed.
2011-05-30 09:21:08.879: [ CRSAPP][11841]32CheckResource error for ora.p730a.vip error code = 1
2011-05-30 09:21:08.883: [ CRSRES][11841]32In stateChanged, ora.p730a.vip target is ONLINE
2011-05-30 09:21:08.883: [ CRSRES][11841]32ora.p730a.vip on p730a went OFFLINE unexpectedly
2011-05-30 09:21:08.883: [ CRSRES][11841]32StopResource: setting CLI values
2011-05-30 09:21:08.903: [ CRSRES][11841]32Attempting to stop `ora.p730a.vip` on member `p730a`
2011-05-30 09:21:09.280: [ CRSRES][11841]32Stop of `ora.p730a.vip` on member `p730a` succeeded.
2011-05-30 09:21:09.280: [ CRSRES][11841]32ora.p730a.vip RESTART_COUNT=0 RESTART_ATTEMPTS=0
2011-05-30 09:21:09.283: [ CRSRES][11841]32ora.p730a.vip failed on p730a relocating.
2011-05-30 09:21:09.321: [ CRSRES][11841]32StopResource: setting CLI values
2011-05-30 09:21:09.330: [ CRSRES][11841]32Attempting to stop `ora.p730a.LISTENER_P730A.lsnr` on member `p730a`
2011-05-30 09:22:26.511: [ CRSRES][11841]32Stop of `ora.p730a.LISTENER_P730A.lsnr` on member `p730a` succeeded.
2011-05-30 09:22:26.527: [ CRSRES][11841]32Attempting to start `ora.p730a.vip` on member `p730b`
2011-05-30 09:22:28.006: [ CRSRES][11841]32Start of `ora.p730a.vip` on member `p730b` succeeded.
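The log shows that the p730a listener went offline because the p730a VIP went offline first: CRS stopped the listener on p730a and then relocated the p730a VIP resource to node p730b.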
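
To find every such event rather than reading the log by eye, the "went OFFLINE unexpectedly" lines can be pulled out of the CRS log. The sketch below assumes the log lines have the same layout as the excerpt above; `offline_events` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: list resources that crsd reported as going OFFLINE unexpectedly.
# Reads CRS log text on stdin and prints "<date> <time> <resource> <node>".
# ASSUMPTION: lines look like the excerpt above, for example
#   2011-05-30 09:21:08.883: [ CRSRES][11841]32ora.p730a.vip on p730a went OFFLINE unexpectedly
offline_events() {
    # \1 = date, \2 = time, \3 = resource name, \4 = node.
    # Lines that do not match the pattern are dropped by -n ... /p.
    sed -n 's/^\([0-9-]*\) \([0-9:.]*\): .*\][0-9]*\(ora\.[^ ]*\) on \([^ ]*\) went OFFLINE unexpectedly.*$/\1 \2 \3 \4/p'
}
```

For example, `offline_events < crsd.log` (with the path to the node's CRS log substituted) prints one line per event, which makes it easy to check whether the outages really recur every four to five days.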

4. Enable debug tracing for the VIP resource
crsctl debug log res "ora.p730a.vip:5"
The resulting trace files are written under $ORA_CRS_HOME/log/p730a/

5. Apply Metalink document ID 1297867.1
Follow the steps below to modify the racgvip script:
1. Stop all node applications.
% srvctl stop nodeapps -n <hostname>

2. Back up, then modify, the racgvip script.

Change:
# timeout of ping in number of loops (1 sec)
PING_TIMEOUT=" -c 1 -w 1"

To:
# timeout of ping in number of loops (3 sec)
PING_TIMEOUT=" -c 1 -w 3"

3. Start the node applications and other necessary resources.
% srvctl start nodeapps -n <hostname>
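
The backup-and-edit in step 2 could be scripted roughly as follows. This is a sketch under assumptions: it expects the script to contain the exact line `PING_TIMEOUT=" -c 1 -w 1"` (optionally indented), and `patch_ping_timeout` is a hypothetical helper name. It should be run only while the node applications are stopped, as in step 1.

```shell
#!/bin/sh
# Sketch: back up a racgvip script and raise the ping wait from 1 s to 3 s.
# ASSUMPTION: the script contains the exact setting  PING_TIMEOUT=" -c 1 -w 1"
# Usage: patch_ping_timeout /path/to/racgvip
patch_ping_timeout() {
    script="$1"
    backup="$script.bak"

    # Stop if the expected line is missing (already patched, or a
    # different version of the script).
    grep -q '^[[:space:]]*PING_TIMEOUT=" -c 1 -w 1"' "$script" || {
        echo "expected PING_TIMEOUT line not found in $script" >&2
        return 1
    }

    # Keep a copy of the original, preserving mode and timestamps.
    cp -p "$script" "$backup" || return 1

    # Edit through a temporary file instead of `sed -i`, which not
    # every sed supports; `cat >` keeps the original file's permissions.
    sed 's/^\([[:space:]]*\)PING_TIMEOUT=" -c 1 -w 1"/\1PING_TIMEOUT=" -c 1 -w 3"/' \
        "$backup" > "$script.new" &&
    cat "$script.new" > "$script" &&
    rm -f "$script.new"
}
```

The explanatory comment above the setting in racgvip is left unchanged by this sketch; only the value itself is edited.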

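6. Disable debug tracing
crsctl debug log res "ora.p730a.vip:0"
In a later phone call, the customer confirmed that the p730a listener outages had not recurred since the racgvip script was modified.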

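4 Conclusions and recommendations
For unusual CRS problems, enable debug tracing on the affected resources and use the resulting logs to pinpoint the cause.

Enable debug
crsctl debug log res "ora.p730a.vip:5"
crsctl debug log res "ora.p730b.vip:5"

Disable debug
crsctl debug log res "ora.p730a.vip:0"
crsctl debug log res "ora.p730b.vip:0"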
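
With more than two nodes, typing these commands by hand gets error-prone. The sketch below only prints the commands for a given debug level and list of nodes, so they can be reviewed before being run. It assumes the VIP resources are named ora.<node>.vip, as in the CRS log; `vip_debug_cmds` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print the crsctl commands that set the VIP debug level on each node.
# ASSUMPTION: VIP resources are named ora.<node>.vip.
# Usage: vip_debug_cmds <level> <node>...
vip_debug_cmds() {
    level="$1"
    shift
    for node in "$@"; do
        printf 'crsctl debug log res "ora.%s.vip:%s"\n' "$node" "$level"
    done
}
```

For example, `vip_debug_cmds 5 p730a p730b` prints the two enable commands, and `vip_debug_cmds 0 p730a p730b` prints the matching disable commands; piping the output to `sh` would execute them.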



Posted by: xjcydf909 | Posted: 2014-10-24 13:20:29
Learned something here, thanks.

Posted by: njrq | Posted: 2014-11-7 16:47:03
Bug after bug; there is no end to them.