Rac One Losing Connection With Relocate..Or..Where The Listener Is Losing It..

Recently we have expanded our Rac One Node with an additional node..So we have a two-node Rac One. What’s in a name, right?

This was an ordeal in itself, which I could have blogged about,  however it was such an unique situation with so many causes it would probably not contribute much. It helped me gaining even more knowledge about the differences between Rac and Rac One installations, so at least it was not all for naught.

When we finally added the node to the Rac One installation, the last issue arose: the actually relocation of the existing databases was failing! A new created database as a test AFTER the 2nd node was added worked perfectly. But when relocating an existing database (existing as in: before the node was added) was relocated, but the connections dropped anyway! This was not so Transparent as advertised, but after quite some searching the issue was found. And since in my searches on the Internet I couldn’t find a reference to what the steps and prerequisites exactly are for a successful fail-over on a RAC One environment, I decided to put my experiences for you guys on-line..

The current situation:

* A two node RAC One system.

* Node 1 has already 12 databases running on it, never ever where these failed over, and seemingly they are a mix of single instance databases, as well as RAC (*sigh*) databases. How this came to be? Long story, but the keyword is: cloning.

* Node two is up and running,fully configured in the Cluster but  containing no databases (yet).

Overview of running processes:

On Node 01:

$/oracle/grid/bin/crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       node01                                 
               ONLINE  ONLINE       node02                                 
ora.LISTENER.lsnr
               ONLINE  ONLINE       node01                                 
               ONLINE  ONLINE       node02                                 
ora.asm
               ONLINE  ONLINE       node01             Started             
               ONLINE  ONLINE       node02             Started             
ora.gsd
               OFFLINE OFFLINE      node01                                 
               OFFLINE OFFLINE      node02                                 
ora.net1.network
               ONLINE  ONLINE       node01                                 
               ONLINE  ONLINE       node02                                 
ora.ons
               ONLINE  ONLINE       node01                                 
               ONLINE  ONLINE       node02                                 
ora.registry.acfs
               ONLINE  ONLINE       node01                                 
               ONLINE  ONLINE       node02                                 
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       node01                                 
ora.node01.vip
      1        ONLINE  ONLINE       node01                                 
ora.node02.vip
      1        ONLINE  ONLINE       node02                                 
ora.cvu
      1        ONLINE  ONLINE       node01                                 
ora.exp06.db
      2        ONLINE  ONLINE       node01             Open                
ora.exp06.experiment06.svc
      1        ONLINE  ONLINE       node01                                 
ora.exp07.db
      2        ONLINE  ONLINE       node02             Open                
ora.exp07.experiment07.svc
      1        ONLINE  ONLINE       node02                                 
ora.oc4j
      1        ONLINE  ONLINE       node01                                 
ora.scan1.vip
      1        ONLINE  ONLINE       node01                                 
[node01:oracle:+ASM1]$ 

Note:
Database exp07 is running on node 2!
Database exp06 is running on node 1!

On Node 02:

$/oracle/grid/bin/crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
               ONLINE  ONLINE       node01                                 
               ONLINE  ONLINE       node02                                 
ora.LISTENER.lsnr
               ONLINE  ONLINE       node01                                 
               ONLINE  ONLINE       node02                                 
ora.asm
               ONLINE  ONLINE       node01             Started             
               ONLINE  ONLINE       node02             Started             
ora.gsd
               OFFLINE OFFLINE      node01                                 
               OFFLINE OFFLINE      node02                                 
ora.net1.network
               ONLINE  ONLINE       node01                                 
               ONLINE  ONLINE       node02                                 
ora.ons
               ONLINE  ONLINE       node01                                 
               ONLINE  ONLINE       node02                                 
ora.registry.acfs
               ONLINE  ONLINE       node01                                 
               ONLINE  ONLINE       node02                                 
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       node01                                 
ora.node01.vip
      1        ONLINE  ONLINE       node01                                 
ora.node02.vip
      1        ONLINE  ONLINE       node02                                 
ora.cvu
      1        ONLINE  ONLINE       node01                                 
ora.exp06.db
      2        ONLINE  ONLINE       node02             Open                
ora.exp06.experiment06.svc
      1        ONLINE  ONLINE       node02                                 
ora.exp07.db
      2        ONLINE  ONLINE       node01             Open                
ora.exp07.experiment07.svc
      1        ONLINE  ONLINE       node01                                 
ora.oc4j
      1        ONLINE  ONLINE       node01                                 
ora.scan1.vip
      1        ONLINE  ONLINE       node01                                 
[node02:oracle:+ASM2]$ 

Steps to relocate a database:

Make sure all the resources are created and available:

If database resource is not available:

$ srvctl add database -d ORCL -o /home/of/the/database/db_1

If instance resource is not available:

$ srvctl add instance -d ORCL -i ORCL1 -n server01

Service resource:

$ srvctl add service -d ORCL -s ORACLE01 -r ORCL
$ srvctl start service -d ORCL -s ORACLE01

Check the resources:

$ crsctl stat res -t|grep exp06
ora.exp06.db
ora.exp06.experiment06.svc

Make sure the connection string is TAF on the client, in the tnsnames.ora:

exp06 =
  (DESCRIPTION =
    (LOAD_BALANCE = ON)
    (FAILOVER = ON)
    (ADDRESS = (PROTOCOL = TCP)(HOST = scan-adress)(PORT = 1521))
    (CONNECT_DATA =
      (SERVICE_NAME = experiment06)
      (FAILOVER_MODE =
        (TYPE = SELECT)
        (METHOD = BASIC)
        (RETRIES = 20)
        (DELAY = 1)
      )
    )
  )

Be aware to connect to the service name, not the database!

Make sure the listener is NOT “hardcoded” in the database p/spfile!

This can be confusing and  difficult to catch when you are not aware of this (as I initially was):

The correct parameter must be:

SQL >show spparameter listener

SID      NAME                          TYPE        VALUE
-------- ----------------------------- ----------- ----------------------------
*        listener_networks             string
*        local_listener                string
*        remote_listener               string      rac-scan:1521

but when looking at the parameter setting:

SQL >show parameter listener

SID      NAME                          TYPE        VALUE
-------- ----------------------------- ----------- ----------------------------
*        listener_networks             string
*        local_listener                string      (DESCRIPTION=(ADDRESS_LIST=(
                                                   ADDRESS=(PROTOCOL=TCP)(HOST=
                                                   bogus.com)(PORT=1521))))
*        remote_listener               string      rac-scan:1521

This is correct, since the scan process is telling the instance at start time who is his listener! But the database needs to have an empty (or none at all) setting in the sp/pfile. If one is set, the scan process WON’T tell the database the new listener and the database drops of the grid while still being online. Connections will also drop, of course.

To clear a set listener string, issue the following command:

SQL> alter system set local_listener ='' scope=spfile;

When this all is set, make sure the database is a RAC One database:

$ srvctl config database -d exp06 -a
Database unique name: exp06
Database name: 
Oracle home: /oracle/base/dbhome01
Oracle user: oracle
Spfile: 
Domain: 
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: exp06
Database instances: exp06
Disk Groups: 
Mount point paths: 
Services: experiment06
Type: RAC
Database is enabled
Database is administrator managed
$

The TYPE is RAC..not quite what we need, so let’s convert this database:

$ srvctl convert database -d exp06 -c RACONENODE -i exp06

Database unique name: exp06
Database name: 
Oracle home: /oracle/base/erpccst0/dbhome01
Oracle user: oracle
Spfile: 
Domain: 
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: exp06
Database instances: 
Disk Groups: 
Mount point paths: 
Services: experiment06
Type: RACOneNode
Online relocation timeout: 30
Instance name prefix: exp06
Candidate servers: bedc-odb1001
Database is enabled
Database is administrator managed

Before the relocation can be successful, check the following values of the sp/pfile:

*.cluster_database=TRUE
*.audit_file_dest='/oracle/base/exp06/admin/exp06/adump
*.diagnostic_dest='/oracle/base/exp06'
exp06_1.core_dump_dest='/oracle/base/exp06/diag/rdbms/exp06/exp06_1/cdump'
exp06_2.core_dump_dest='/oracle/base/exp06/diag/rdbms/exp06/exp06_2/cdump'
exp06_1.undo_tablespace='UNDOTBS1'
exp06_2.undo_tablespace='UNDOTBS2'
*.remote_listener='rac-scan-name:1521'
REMOVE:
local_listener
service_names

 

 If these steps above where executed while the “original” database was still running, you MAY need to stop/start the database. Use the srvctl utilities for this.

Advertisements

About GemsOfProgramming

Beeing a previously enthusiastic Java programmer, I rolled into the Oracle Database Administration world. It turned out I got a knack for this, and since approx. 2000 I'm a full time DBA. My experiences touches lot of Oracle products like Forms and Reports 9/10, JDAPI, Application Server, Weblogic Fusion and of course: Oracle Enterprise Databases, JavaFX, Swing and other Java components.
This entry was posted in Databases and tagged , , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s