Oracle 12c RAC installation hits CLSRSC-507: The root script cannot proceed on this node

Posted by: 风哥 | Published: 2015-12-07 16:35:27
CLSRSC-507: The root script cannot proceed on this node <node-n> encountered during Oracle RAC 12c installation (Doc ID 1919825.1)


APPLIES TO:
Oracle Database - Enterprise Edition - Version 12.1.0.2 and later

Information in this document applies to any platform.


PURPOSE
The note lists known issues regarding the following error:
CLSRSC-507: The root script cannot proceed on this node <non-first_node> because either the first-node operations have not completed on node <first_node> or there was an error in obtaining the status of the first-node operations.


DETAILS

Case 1: root script did not succeed on first node
The Grid Infrastructure root script (root.sh or rootupgrade.sh) must complete successfully on node1 (the first node) before it can be run on the other nodes; the first node is the one on which runInstaller/config.sh was run. This behavior is new in 12.1.0.2.
If this is the case, complete the root script on node1 before running it on the other nodes.
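To confirm whether the first node actually finished, you can grep its rootcrs log for the CLSRSC-325 success marker. A minimal sketch (the real log path is <NEW_GI_HOME>/cfgtoollogs/crsconfig/rootcrs_<node>_<timestamp>.log; here one sample line from such a log stands in for the file):

```shell
# Sketch: check the first node's rootcrs log for the CLSRSC-325 success marker.
# A sample line stands in for the real rootcrs_<node>_<timestamp>.log file.
LOG=$(mktemp)
echo '2014-08-22 10:23:10: CLSRSC-325: Configure Oracle Grid Infrastructure for a Cluster ... succeeded' > "$LOG"
if grep -q 'CLSRSC-325' "$LOG"; then
  status="completed"
else
  status="not completed"
fi
echo "first-node root script: $status"
rm -f "$LOG"
```

If the marker is absent, finish (or fix) root.sh on node1 before touching the other nodes.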

Case 2: root script completed on first node but other nodes fail to obtain the status due to an ocrdump issue
In this case, it is confirmed that the root script finished on node1:
<NEW_GI_HOME>/cfgtoollogs/crsconfig/rootcrs_<node>_<timestamp>.log
2014-08-22 10:23:10: Invoking "/opt/ogrid/12.1.0.2/bin/cluutil -exec -ocrsetval -key SYSTEM.rootcrs.checkpoints.firstnode -value SUCCESS"
2014-08-22 10:23:10: trace file=/opt/oracle/crsdata/inari/crsconfig/cluutil0.log
2014-08-22 10:23:10: Executing cmd: /opt/ogrid/12.1.0.2/bin/cluutil -exec -ocrsetval -key SYSTEM.rootcrs.checkpoints.firstnode -value SUCCESS
2014-08-22 10:23:10: Succeeded in writing the key pair (SYSTEM.rootcrs.checkpoints.firstnode:SUCCESS) to OCR
2014-08-22 10:23:10: Executing cmd: /opt/ogrid/12.1.0.2/bin/clsecho -p has -f clsrsc -m 325
2014-08-22 10:23:10: Command output:
> CLSRSC-325: Configure Oracle Grid Infrastructure for a Cluster ... succeeded
>End Command output
2014-08-22 10:23:10: CLSRSC-325: Configure Oracle Grid Infrastructure for a Cluster ... succeeded

And the root script fails on the other nodes because ocrdump failed:
<NEW_GI_HOME>/cfgtoollogs/crsconfig/rootcrs_<node>_<timestamp>.log
2014-09-04 13:45:34: ASM_DISKS=ORCL:OCR01,ORCL:OCR02,ORCL:OCR03
....
2014-09-04 13:46:04: Check the existence of global ckpt 'checkpoints.firstnode'
2014-09-04 13:46:04: setting ORAASM_UPGRADE to 1
2014-09-04 13:46:04: Invoking "/product/app/12.1.0.2/grid/bin/cluutil -exec -keyexists -key checkpoints.firstnode"
2014-09-04 13:46:04: trace file=/product/app/grid/crsdata/sipr0-db04/crsconfig/cluutil8.log
2014-09-04 13:46:04: Running as user grid: /product/app/12.1.0.2/grid/bin/cluutil -exec -keyexists -key checkpoints.firstnode
2014-09-04 13:46:04: s_run_as_user2: Running /bin/su grid -c ' echo CLSRSC_START; /product/app/12.1.0.2/grid/bin/cluutil -exec -keyexists -key checkpoints.firstnode '
2014-09-04 13:46:05: Removing file /tmp/fileRiu5NI
2014-09-04 13:46:05: Successfully removed file: /tmp/fileRiu5NI
2014-09-04 13:46:05: pipe exit code: 256
2014-09-04 13:46:05: /bin/su exited with rc=1
2014-09-04 13:46:05: oracle.ops.mgmt.rawdevice.OCRException: PROC-32: Cluster Ready Services on the local node is not running Messaging error [gipcretConnectionRefused] [29]
2014-09-04 13:46:05: Cannot get OCR key with CLUUTIL, try using OCRDUMP.
2014-09-04 13:46:05: Check OCR key using ocrdump
2014-09-04 13:46:22: ocrdump output: PROT-302: Failed to initialize ocrdump
2014-09-04 13:46:22: The key pair with keyname: SYSTEM.rootcrs.checkpoints.firstnode does not exist in OCR.
2014-09-04 13:46:22: Checking a remote host sipr0-db03 for reachability...
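The check that the non-first node performs can be reproduced by hand: dump the OCR and look for the SYSTEM.rootcrs.checkpoints.firstnode key. A sketch of that lookup (the real command, run as root, would be `$GRID_HOME/bin/ocrdump -stdout`; the sample dump text below is a stand-in for its output):

```shell
# Sketch: look for the first-node checkpoint key in an OCR dump.
# Sample dump text stands in for real `ocrdump -stdout` output.
DUMP='[SYSTEM.rootcrs.checkpoints.firstnode]
ORATEXT : SUCCESS'
if printf '%s\n' "$DUMP" | grep -q 'SYSTEM\.rootcrs\.checkpoints\.firstnode'; then
  found="yes"
else
  found="no"
fi
echo "checkpoint key present: $found"
```

If cluutil fails (as above, with PROC-32) and ocrdump also fails, the non-first node cannot obtain the status and raises CLSRSC-507.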


Case 2.1: ocrdump fails due to errors AMDU-00200 and AMDU-00201
<ADR_HOME>/crs/<node>/crs/trace/ocrdump_<pid>.trc
2014-09-04 13:46:14.044274 : OCRASM: proprasmo: ASM instance is down. Proceed to open the file in dirty mode.
CLWAL: clsw_Initialize: Error [32] from procr_init_ext
CLWAL: clsw_Initialize: Error [PROCL-32: Oracle High Availability Services on the local node is not running Messaging error [gipcretConnectionRefused] [29]] from procr_init_ext
2014-09-04 13:46:14.050831 : GPNP: clsgpnpkww_initclswcx: [at clsgpnpkww.c:351] Result: (56) CLSGPNP_OCR_INIT. (:GPNP01201:)Failed to init CLSW-OLR context. CLSW Error (3): CLSW-3: Error in the cluster registry (OCR) layer. [32] [PROCL-32: Oracle High Availability Services on the local node is not running Messaging error [gipcretConnectionRefused] [29]]
2014-09-04 13:46:14.093544 : OCRASM: proprasmo: Error [13] in opening the GPNP profile. Try to get offline profile
2014-09-04 13:46:16.210708 : OCRRAW: kgfo_kge2slos error stack at kgfolclcpi1: AMDU-00200: Unable to read [32768] bytes from Disk N0050 at offset [140737488355328]
AMDU-00201: Disk N0050: '/dev/sdg'
AMDU-00200: Unable to read [32768] bytes from Disk N0049 at offset [140737488355328]
AMDU-00201: Disk N0049: '/dev/sdf'
AMDU-00200: Unable to read [32768] bytes from Disk N0048 at offset [140737488355328]
AMDU-00201: Disk N0048: '/dev/sde'
AMDU-00200: Unable to read [32768] bytes from Disk N0035 at offset [140737488355328]
AMDU-00201: Disk N0035: '/dev/sdaw'
AMDU-00200: Unable to read [32768] bytes from Disk N0024 at offset [140737488355328]
AMDU-00201: Disk N0024: '/dev/sdaq'
....
2014-09-04 13:46:16.212934 : OCRASM: proprasmo: Failed to open file in dirty mode
2014-09-04 13:46:16.212964 : OCRASM: proprasmo: dgname is [OCRVOTE] : discoverystring []
2014-09-04 13:46:16.212990 : OCRASM: proprasmo: Error in open/create file in dg [OCRVOTE]
OCRASM: SLOS : SLOS: cat=8, opn=kgfolclcpi1, dep=200, loc=kgfokge
2014-09-04 13:46:16.213075 : OCRASM: ASM Error Stack :
....
2014-09-04 13:46:22.690905 : OCRASM: proprasmo: kgfoCheckMount returned [7]
2014-09-04 13:46:22.690933 : OCRASM: proprasmo: The ASM instance is down
2014-09-04 13:46:22.692150 : OCRRAW: proprioo: Failed to open [+OCRVOTE/sipr0-dbhv1/OCRFILE/registry.255.857389203]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.
2014-09-04 13:46:22.692204 : OCRRAW: proprioo: No OCR/OLR devices are usable
2014-09-04 13:46:22.692239 : OCRRAW: proprinit: Could not open raw device
2014-09-04 13:46:22.692561 : default: a_init:7!: Backend init unsuccessful : [26]
2014-09-04 13:46:22.692777 : OCRDUMP: Failed to initailized OCR context. Error [PROC-26: Error while accessing the physical storage
] [26].
2014-09-04 13:46:22.692822 : OCRDUMP: Failed to initialize ocrdump stage 2
2014-09-04 13:46:22.692864 : OCRDUMP: Exiting [status=failed]...


Solution:
The solution is to apply patch 18456643, then re-run root script.

This situation is encountered fairly often. The way to handle it is to completely deinstall Oracle and reinstall, with these steps:
    1. Install the Grid clusterware normally.
    2. When prompted to execute the root script, do not run it on either node.
    3. Install patch 18456643 on both nodes.
    4. Run the root.sh script on the two nodes in order; the 12c RAC clusterware installation then succeeds.
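The ordering in the steps above can be sketched as a plan. The GRID_HOME path, patch staging directory, and node names below are hypothetical placeholders; `opatch apply -local` is the usual per-node application method:

```shell
# Sketch of the ordering only; paths and node names are placeholders.
GRID_HOME=${GRID_HOME:-/u01/app/12.1.0.2/grid}
PATCH_DIR=${PATCH_DIR:-/stage/18456643}
plan=""
for node in rac1 rac2; do
  # step 3: apply patch 18456643 on every node before any root.sh run
  plan="$plan[$node] su - grid -c 'cd $PATCH_DIR && $GRID_HOME/OPatch/opatch apply -local'
"
done
# step 4: only then run the root script, first node first
plan="$plan[rac1] $GRID_HOME/root.sh   # first node: must finish before the next
[rac2] $GRID_HOME/root.sh
"
printf '%s' "$plan"
```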


Case 2.2: ocrdump fails with AMDU-00210, AMDU-00205, AMDU-00201, and AMDU-00407 (asmlib errors in asm_close/asm_open)
<ADR_HOME>/crs/<node>/crs/trace/ocrdump_<pid>.trc
OCRASM: proprasmo: ASM instance is down. Proceed to open the file in dirty mode.
2014-09-09 13:52:04.131609 : OCRRAW: kgfo_kge2slos error stack at kgfolclcpi1: AMDU-00210: No disks found in diskgroup CRSGRP
AMDU-00210: No disks found in diskgroup CRSGRP
AMDU-00205: Disk N0033 open failed during deep discovery.
AMDU-00201: Disk N0033: 'ORCL:REDOA'
AMDU-00407: asmlib error!! function = [asm_close], error = [0], mesg = [Invalid argument]
AMDU-00407: asmlib error!! function = [asm_open], error = [0], mesg = [Operation not permitted]
....
2014-09-09 13:52:04.131691 : OCRRAW: kgfoOpenDirty: dg=CRSGRP diskstring= filename=/opt/oracle/crsdata/drcsvr713/output/tmp_amdu_ocr_CRSGRP_09_09_2014_13_52_04
....
2014-09-09 13:52:04.131756 : OCRRAW: Category: 8
2014-09-09 13:52:04.131767 : OCRRAW: DepInfo: 210
....
OCRRAW: proprioo: No OCR/OLR devices are usable
OCRRAW: proprinit: Could not open raw device
default: a_init:7!: Backend init unsuccessful : [26]
OCRDUMP: Failed to initailized OCR context. Error [PROC-26: Error while accessing the physical storage] [26].
OCRDUMP: Failed to initialize ocrdump stage 2
OCRDUMP: Exiting [status=failed]...


Solution:
The cause is that ASMLIB is used but not properly configured, as confirmed by the output of the following commands on all nodes:
/etc/init.d/oracleasm listdisks
/etc/init.d/oracleasm scandisks
/etc/init.d/oracleasm listdisks
/etc/init.d/oracleasm listdisks | xargs /etc/init.d/oracleasm querydisk -d
/etc/init.d/oracleasm status
/usr/sbin/oracleasm configure
ls -l /dev/oracleasm/disks/*
rpm -qa | grep oracleasm
uname -a


It is recommended to use AFD (ASM Filter Driver) instead of ASMLIB, but if ASMLIB must be used, fix the misconfiguration, then re-run the root script.
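The command list above boils down to one invariant: every disk returned by `oracleasm listdisks` should answer `querydisk` cleanly. A minimal sketch of that per-node loop (a stub function stands in for the real `/etc/init.d/oracleasm querydisk -d` call, and the disk names are hypothetical):

```shell
# Sketch: flag ASMLIB disks that fail querydisk on this node.
# A stub stands in for /etc/init.d/oracleasm so the loop's logic is visible.
querydisk() {  # stand-in for: /etc/init.d/oracleasm querydisk -d "$1"
  case "$1" in OCR01|OCR02) return 0 ;; *) return 1 ;; esac
}
bad=0
for d in OCR01 OCR02 REDOA; do   # normally: $(/etc/init.d/oracleasm listdisks)
  if ! querydisk "$d"; then
    echo "disk $d not usable via ASMLIB"
    bad=$((bad+1))
  fi
done
echo "$bad problem disk(s)"
```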


Case 2.3: ocrdump fails because amdu core dumped
<ADR_HOME>/crs/<node>/crs/trace/ocrdump_<pid>.trc
2014-08-27 14:34:33.077433 : OCRRAW: kgfo_kge2slos error stack at kgfolclcpi1: AMDU-00210: No disks found in diskgroup QUORUM
AMDU-00210: No disks found in diskgroup QUORUM
....
2014-08-27 14:34:39.262032 : OCRASM: proprasmo: kgfoCheckMount returned [7]
2014-08-27 14:34:39.262041 : OCRASM: proprasmo: The ASM instance is down
2014-08-27 14:34:39.262521 : OCRRAW: proprioo: Failed to open [+QUORUM/wrac-cl-tor/OCRFILE/registry.255.856261165]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.
2014-08-27 14:34:39.262540 : OCRRAW: proprioo: No OCR/OLR devices are usable
2014-08-27 14:34:39.262552 : OCRRAW: proprinit: Could not open raw device
2014-08-27 14:34:39.262668 : default: a_init:7!: Backend init unsuccessful : [26]
2014-08-27 14:34:39.262743 : OCRDUMP: Failed to initailized OCR context. Error [PROC-26: Error while accessing the physical storage
] [26].
2014-08-27 14:34:39.262760 : OCRDUMP: Failed to initialize ocrdump stage 2

amdu command core dumps:
$ amdu -diskstring 'ORCL:*'
amdu_2014_09_09_14_35_43/
amdu: ossdebug.c:1136: ossdebug_init_diag: Assertion `0' failed.
Aborted (core dumped)


Solution:
At the time of this writing, the issue is still being worked on in bug 19592048; engage Oracle Support for further help.


Case 2.4: same disk name points to different storage on different nodes
<ADR_HOME>/crs/<node>/crs/trace/ocrdump_<pid>.trc
2014-09-10 13:12:53.429460 : OCRASM: proprasmo: Error [13] in opening the GPNP profile. Try to get offline profile
2014-09-10 13:12:53.435300 : OCRRAW: kgfo_kge2slos error stack at kgfolclcpi1: AMDU-00210: No disks found in diskgroup DATA01
AMDU-00210: No disks found in diskgroup DATA01

amdu command output on node1
Disk Path: /dev/asm-data001
Unique Disk ID:
Disk Label:
Physical Sector Size: 512 bytes
Disk Size: 409600 megabytes
Group Name: DATA01
Disk Name: DATA01_0000
Failure Group Name: DATA01_0000

amdu command output on node2
Disk Path: /dev/asm-data001
Unique Disk ID:
Disk Label:
Physical Sector Size: 512 bytes
Disk Size: 409600 megabytes
** NOT A VALID ASM DISK HEADER. BAD VALUE IN FIELD blksize_kfdhdb **


Solution:
The solution is to engage the SysAdmin to fix the disk setup issue.
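One quick way to spot this case is to fingerprint the first bytes behind the same device path on each node; identical storage must give identical header bytes. A sketch (the real per-node check would be `dd if=/dev/asm-data001 bs=4096 count=1 2>/dev/null | md5sum`; two temp files stand in for the header read on node1 and node2):

```shell
# Sketch: the same device path should show identical header bytes on every node.
# Two temp files stand in for the 4K header read on node1 and node2.
n1=$(mktemp); n2=$(mktemp)
printf 'ORCLDISKDATA01_0000' > "$n1"   # node1: plausible ASM header bytes
printf 'something else'      > "$n2"   # node2: same path, different storage
if [ "$(md5sum < "$n1")" = "$(md5sum < "$n2")" ]; then
  verdict="match"
else
  verdict="differ"
fi
echo "headers $verdict across nodes"
rm -f "$n1" "$n2"
```

A "differ" result matches the symptom above: node2 reads a block device whose header is not a valid ASM disk header.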


Case 2.5: the same storage subsystem is shared by different clusters, and the same diskgroup name exists in more than one cluster
<ADR_HOME>/crs/<node>/crs/trace/ocrdump_<pid>.trc
2015-07-17 16:57:00.532160 : OCRRAW: AMDU-00211: Inconsistent disks in diskgroup OCR


Solution:
The issue was investigated in bug 21469989. The cause is that multiple clusters use the same diskgroup name while seeing the same shared disks; the workaround is to change the diskgroup name for the new cluster.
An example: both cluster1 and cluster2 see the same physical disks /dev/mapper/disk1-10; disk1-5 are allocated to cluster1 and disk6-10 to cluster2, yet both clusters try to use the same diskgroup name dgsys.
Ref: BUG 21469989 - CLSRSC-507 ROOT.SH FAILING ON NODE 2 WHEN CHECKING GLOBAL CHECKPOINT  
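To see which diskgroup name is stamped on each candidate disk, `kfed read <device> | grep kfdhdb.grpname` can be run per disk; a name appearing on disks meant for two different clusters is the collision described above. A sketch of that comparison (the cluster tags, device paths, and group names below are hypothetical stand-ins for real kfed output):

```shell
# Sketch: collect the diskgroup name stamped on each candidate disk and flag a
# name that spans disks allocated to more than one cluster.
# Sample lines (cluster tag, device, group name) stand in for real kfed output.
headers='cluster1 /dev/mapper/disk1 dgsys
cluster2 /dev/mapper/disk6 dgsys'
dupes=$(printf '%s\n' "$headers" | awk '{print $3}' | sort | uniq -d)
if [ -n "$dupes" ]; then
  echo "diskgroup name used by more than one cluster: $dupes"
fi
```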

Case 2.6: the root user sees the same physical disk multiple times via different paths
<ADR_HOME>/crs/<node>/crs/trace/ocrdump_<pid>.trc
2015-07-17 16:57:00.532160 : OCRRAW: AMDU-00211: Inconsistent disks in diskgroup OCR


Solution:
The solution is to ensure the disk discovery string is set correctly so that the root user sees each physical disk only once.
Ref: BUG 21164225 - OCRDUMP FAILS WITH AMDU-211 ONLY ON NORMAL REDUNDANCY  
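Duplicate visibility can be detected by collecting a stable device ID per candidate path (for example via `/lib/udev/scsi_id -g -u -d <path>`) and looking for repeats. A sketch, with hypothetical path-to-WWID pairs standing in for real scsi_id output:

```shell
# Sketch: flag one physical disk visible under several paths to root.
# Sample path->WWID pairs stand in for real /lib/udev/scsi_id output.
ids='/dev/sdb 360a98000375
/dev/sdc 360a98000375
/dev/sdd 360a98000412'
dup_wwid=$(printf '%s\n' "$ids" | awk '{print $2}' | sort | uniq -d)
if [ -n "$dup_wwid" ]; then
  echo "WWID seen more than once: $dup_wwid - tighten the ASM discovery string"
fi
```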


Case 3: root script completed on first node but other nodes fail to obtain the status because ocrdump was not executed
In this case, it's confirmed that root script finished on node1:
<NEW_GI_HOME>/cfgtoollogs/crsconfig/rootcrs_<node>_<timestamp>.log
CLSRSC-325: Configure Oracle Grid Infrastructure for a Cluster ... succeeded

And the root script fails on the other nodes because ocrdump was not executed:
2014-08-28 17:53:55: Check the existence of global ckpt 'checkpoints.firstnode'
2014-08-28 17:53:55: setting ORAASM_UPGRADE to 1
2014-08-28 17:53:55: Invoking "/opt/12.1.0.2/grid/bin/cluutil -exec -keyexists -key checkpoints.firstnode"
2014-08-28 17:53:55: trace file=/opt/oracle/crsdata/racnode2/crsconfig/cluutil3.log
2014-08-28 17:53:55: Running as user oracle: /opt/12.1.0.2/grid/bin/cluutil -exec -keyexists -key checkpoints.firstnode
2014-08-28 17:53:55: s_run_as_user2: Running /bin/su oracle -c ' echo CLSRSC_START; /opt/12.1.0.2/grid/bin/cluutil -exec -keyexists -key checkpoints.firstnode '
2014-08-28 17:53:56: Removing file /tmp/fileZCubj2
2014-08-28 17:53:56: Successfully removed file: /tmp/fileZCubj2
2014-08-28 17:53:56: pipe exit code: 0                   ====>>>> cluutil failed with PROC-32 but exit code 0
2014-08-28 17:53:56: /bin/su successfully executed
2014-08-28 17:53:56: oracle.ops.mgmt.rawdevice.OCRException: PROC-32: Cluster Ready Services on the local node is not running Messaging error [gipcretConnectionRefused] [29]
2014-08-28 17:53:56: Checking a remote host dblab01 for reachability...
....
2014-08-28 17:53:57: CLSRSC-507: The root script cannot proceed on this node dblab02 because either the first-node operations have not completed on node dblab01 or there was an error in obtaining the status of the first-node operations.


The cluutil trace <ORACLE_BASE>/crsdata/racnode2/crsconfig/cluutil3.log confirms that it failed:
[main] [ 2014-08-29 17:40:46.750 EDT ] [OCR.<init>:278] ocr Error code = 32
[main] [ 2014-08-29 17:40:46.750 EDT ] [ClusterExecUtil.executeCmd:168] Exception caught: PROC-32: Cluster Ready Services on the local node is not running Messaging error [gipcretConnectionRefused] [29]
[main] [ 2014-08-29 17:40:46.750 EDT ] [ClusterUtil.main:236] ClusterUtil.execute rc: 1


The issue was investigated in bug 19570598:
BUG 19570598 - ROOT.SH FAILS ON NODE2 WHILE CHECKING GLOBAL FIRST NODE CHECKPOINT
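Note the trap in the log above: cluutil hit PROC-32 yet exited with code 0, so the caller trusted a failed check. A defensive wrapper scans the tool's output for PROC-/CLSRSC- errors instead of relying on the exit code alone. A sketch, with a sample output line from the log standing in for a real cluutil run:

```shell
# Sketch: don't trust cluutil's exit code alone; scan its output for PROC- errors.
# Sample output and exit code mirror the log above (failure, yet rc=0).
out='oracle.ops.mgmt.rawdevice.OCRException: PROC-32: Cluster Ready Services on the local node is not running'
rc=0
if [ "$rc" -ne 0 ] || printf '%s\n' "$out" | grep -q 'PROC-'; then
  result="cluutil check failed"
else
  result="cluutil check ok"
fi
echo "$result"
```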

