We were building Exadata environments for one of our critical EBS applications, which involved cloning the EBS application from legacy servers to Exadata. After the build, we needed to run a RAC service failover test for the newly migrated application. Our architecture had a single-node application tier and a two-node database tier.
Steps performed for testing RAC failover:
- First, stop service on one RAC instance
srvctl stop service -d <db_name> -s <service_name> -instance <instance_name>
srvctl status service -s <service_name> -d <db_name>
- Stop the RAC instance
srvctl stop instance -d <db_name> -i <instance_name>
- Next, test the failover (verify that CM is running, forms are working, etc.)
- Now, bring the RAC service/instance back up and repeat the steps above for the second instance
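The steps above can be sketched as a small dry-run script. This is only an illustration: the database, service and instance names (EBSDB, EBS_SVC, EBSDB1) are hypothetical, and the script uses the short srvctl options (-d, -s, -i) throughout, so check them against your srvctl version before running anything for real. By default it only prints the commands.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the failover test for one RAC instance.
# All names passed in below are hypothetical placeholders.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: failover_test_steps DB_NAME SERVICE_NAME INSTANCE_NAME
failover_test_steps() {
    local db="$1" svc="$2" inst="$3"
    run srvctl stop service -d "$db" -s "$svc" -i "$inst"   # stop the service on one instance
    run srvctl status service -d "$db" -s "$svc"            # confirm where the service now runs
    run srvctl stop instance -d "$db" -i "$inst"            # stop the RAC instance itself
}

failover_test_steps EBSDB EBS_SVC EBSDB1
```

After these commands, verify CM and forms manually, bring the service and instance back up, and call the function again with the second instance name.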
During our test, forms kept working (sessions either stayed connected to the same DB node or re-established a new session with the surviving instance). CM failover, however, did not work.
Ideally, CM should fail over to the surviving DB instance. In our case, CM only worked after we restarted it (using adcmctl.sh) and restarted the apps listener separately once the RAC instance had been taken down.
On thorough troubleshooting, we figured out that the following profile options were incorrect.
‘Concurrent:TM Transport Type’ was set to PIPE
‘Concurrent:PCP Instance Check’ was set to ON
“PIPE” means that the client and the transaction manager must both be on the same database instance to be able to communicate, whereas “QUEUE” means that they communicate through the Advanced Queue mechanism, which removes the restriction to a single instance.
For non-RAC environments and Release 11i, the default is PIPE, whereas for RAC and R12 the default is QUEUE.
When the profile ‘Concurrent:PCP Instance Check’ is set to ON, managers try to move to a secondary Concurrent Tier when a RAC instance fails. In our case there was no second CM node, so we set it to OFF.
So please ensure the following are set correctly so that CM failover works:
- Modify the cp_twotask setting in the context file of the Concurrent Tier to point to the load-balanced alias (‘cp_twotask=_BALANCE’)
- Set Profile “Concurrent:TM Transport Type” to QUEUE
- Set Profile “Concurrent:PCP Instance Check” to OFF (if there is no secondary CM node)
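
The two profile rules above can be written as a small check, shown here as a rough sketch rather than an Oracle-supplied tool. The function name check_cm_failover_settings and its three arguments are hypothetical. It does not read anything from the database, so you pass in the values you looked up yourself in the profile options form.

```shell
#!/usr/bin/env bash
# Hypothetical helper: checks profile values against the rules in this post.
# Usage: check_cm_failover_settings TRANSPORT_TYPE INSTANCE_CHECK HAS_SECONDARY_CM
#   TRANSPORT_TYPE    value of "Concurrent:TM Transport Type" (PIPE or QUEUE)
#   INSTANCE_CHECK    value of "Concurrent:PCP Instance Check" (ON or OFF)
#   HAS_SECONDARY_CM  yes if a second CM node exists, anything else otherwise
# The exit status is the number of problems found.
check_cm_failover_settings() {
    local transport="$1" instance_check="$2" has_secondary_cm="$3"
    local problems=0

    # PIPE ties the client and transaction manager to one instance.
    if [ "$transport" != "QUEUE" ]; then
        echo "Concurrent:TM Transport Type is $transport; set it to QUEUE"
        problems=$((problems + 1))
    fi

    # ON makes managers look for a secondary Concurrent Tier to move to.
    if [ "$instance_check" = "ON" ] && [ "$has_secondary_cm" != "yes" ]; then
        echo "Concurrent:PCP Instance Check is ON but there is no secondary CM node; set it to OFF"
        problems=$((problems + 1))
    fi

    if [ "$problems" -eq 0 ]; then
        echo "settings look consistent with CM failover"
    fi
    return "$problems"
}

# The values we originally had: both rules are violated.
check_cm_failover_settings PIPE ON no || true
```

The cp_twotask alias is left out of the check on purpose, since its value depends on your environment.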