Adding a new node to a RAC cluster is normally fairly simple, provided you keep the OS, and specifically the Linux kernel, EXACTLY the same. But as always, it's different in the real world, where reality has a way of creating havoc.
So, as it happened in our case: the addNode failed. To be more specific: it failed multiple times, at various points.
This is our story… (OK, we have watched too many episodes of Law & Order).
The first issue was: the voting disk was not found.
Error:
[ CSSD]clssnmvDiskVerify: discovered a potential voting file
[ CSSD]clssnmvDiskVerify: TOC format mismatch expected(0x634c7373 0x546f636b), found(0x0000 0x0000)
ASM was able to find the disks, but was not able to determine the voting disk. A strange issue, but after some searching it turned out to be related to the newer kmod-oracleasm package (kmod-oracleasm-2.0.8-13.el6_8.x86_64) installed on the new node, while the other nodes ran a lower version. For the details that led to the solution, see Doc ID 1994371.1.
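A quick way to spot such a version mismatch is to query the package on every node; a minimal sketch (the node names here are placeholders, not our actual hostnames):

```shell
# Compare the oracleasm kernel module package version across all cluster nodes.
# Replace racnode1/racnode2/newnode with your own node names.
for node in racnode1 racnode2 newnode; do
  echo "== $node =="
  ssh "$node" rpm -q kmod-oracleasm oracleasm-support
done
```

On our cluster this would have shown the new node running 2.0.8-13.el6_8 while the existing nodes ran an older release.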
In short: add the following to the oracleasm config file (/etc/sysconfig/oracleasm):
(the Doc ID states to set this to 'false', but that was not logical in our case.)
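The actual parameter appears to have dropped out of the post. Based on Doc ID 1994371.1, which covers voting-disk discovery problems introduced with kmod-oracleasm 2.0.8, the setting in question is presumably ORACLEASM_USE_LOGICAL_BLOCK_SIZE; a sketch under that assumption:

```shell
# /etc/sysconfig/oracleasm -- assumed parameter, per Doc ID 1994371.1;
# verify against the note for your own storage before applying.
# Restart the oracleasm service afterwards (e.g. service oracleasm restart).
ORACLEASM_USE_LOGICAL_BLOCK_SIZE="true"
```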
The above solved the issue with the voting disk.
The second issue we ran into was: the ASM instance refused to start.
Error:
GIMH: GIM-00104: Health check failed to connect to instance.
GIM-00090: OS-dependent operation: open failed with status: 2
GIM-00091: OS failure message: No such file or directory
GIM-00092: OS failure occurred at: sskgmsmr_7
Because we had run root.sh multiple times while troubleshooting the first error, the health-check file (hc_*) in $GRID_HOME/dbs had been created, but due to the voting disk error it was never completed correctly.
Removing this file, and cleaning the whole Oracle home of any traces of this instance, solved the issue and enabled us to re-run the root.sh installation step successfully.
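The cleanup and re-run can be sketched roughly as follows. The ASM SID (+ASM1) is an assumption for the new node; adapt it, and check with Oracle Support notes before deconfiguring anything on a production cluster:

```shell
# Run as root on the new node.
# Remove the stale health-check file left behind by the failed root.sh runs
# (+ASM1 is an assumed SID -- use the ASM instance name of this node):
rm -f "$GRID_HOME"/dbs/hc_+ASM1.dat

# Deconfigure the partial Clusterware install on this node, then re-run root.sh:
"$GRID_HOME"/crs/install/rootcrs.pl -deconfig -force
"$GRID_HOME"/root.sh
```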