Cold ACE?
Ever seen an ACE in standby cold state? This means the standby ACE has not been able to synchronize properly with the active ACE. It usually happens when the standby ACE is missing some certificates, keys or script files referenced by the active ACE, typically after an RMA. In that state, the ACE won’t be able to perform a stateful failover, and all the sessions would be lost should a failover occur.
axsl01/Admin# sh ft group status
FT Group : 1
Configured Status : in-service
Maintenance mode : MAINT_MODE_OFF
My State : FSM_FT_STATE_ACTIVE
Peer State : FSM_FT_STATE_STANDBY_HOT
Peer Id : 1
No. of Contexts : 1
Running cfg sync status : Running configuration sync has completed
Startup cfg sync status : Startup configuration sync has completed

FT Group : 2
Configured Status : in-service
Maintenance mode : MAINT_MODE_OFF
My State : FSM_FT_STATE_ACTIVE
Peer State : FSM_FT_STATE_STANDBY_COLD
Peer Id : 1
No. of Contexts : 1
Running cfg sync status : Peer in Cold State. Incremental Sync Failure: script file not configured
Startup cfg sync status : Peer in Cold State. Incremental Sync Failure: script file not configured
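If you collect this output from several ACE pairs, spotting the cold groups by eye gets tedious. Here is a minimal Python sketch that scans the text of `sh ft group status` and lists the FT groups where either side is in standby cold. It assumes the field names shown above ("FT Group", "My State", "Peer State"); the function name `cold_groups` and the sample text are only for illustration, and how you fetch the output from the device is left out.

```python
import re

# Assumption: every group block starts with "FT Group : <id>" and is followed
# by "My State" and "Peer State" fields, as in the output shown above.
GROUP_RE = re.compile(
    r"FT Group\s*:\s*(?P<group>\d+)"
    r".*?My State\s*:\s*(?P<mine>FSM_FT_STATE_\w+)"
    r".*?Peer State\s*:\s*(?P<peer>FSM_FT_STATE_\w+)",
    re.DOTALL,
)

COLD = "FSM_FT_STATE_STANDBY_COLD"


def cold_groups(status_output):
    """Return the IDs of FT groups where either side is in standby cold."""
    return [
        int(m["group"])
        for m in GROUP_RE.finditer(status_output)
        if COLD in (m["mine"], m["peer"])
    ]


# Trimmed copy of the output above, used only as sample input.
SAMPLE = """\
FT Group : 1
My State : FSM_FT_STATE_ACTIVE
Peer State : FSM_FT_STATE_STANDBY_HOT
FT Group : 2
My State : FSM_FT_STATE_ACTIVE
Peer State : FSM_FT_STATE_STANDBY_COLD
"""

print(cold_groups(SAMPLE))  # [2]
```

The regular expression tolerates both the line-by-line layout and output pasted onto a single line, since it only relies on the order of the fields.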
The first step is, of course, to copy the missing files to the standby ACE. Script files have to be copied via tftp; certificates and keys may be copied via tftp or terminal copy/paste. In newer firmware releases you can easily identify what caused the ACE to go into that cold state:
axsl02/C1# sh ft config-error
Mon Aug 13 09:19:18 UTC 2012
`script file 2 test.tcl`
Error: Unable to locate the script file 'test.tcl'
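To turn that output into a list of files to copy, something like the following Python sketch could help. It only matches the script-file message shown above; the wording the ACE uses for missing certificates or keys is not shown in this post, so those are deliberately not handled. The name `missing_scripts` is just an illustrative choice.

```python
import re

# Assumption: only the script-file message shown above is matched; the exact
# wording for missing certificates or keys is not covered here.
MISSING_SCRIPT_RE = re.compile(r"Unable to locate the script file '([^']+)'")


def missing_scripts(config_error_output):
    """Return script file names reported as missing, in order, without duplicates."""
    seen = []
    for name in MISSING_SCRIPT_RE.findall(config_error_output):
        if name not in seen:
            seen.append(name)
    return seen


# Copy of the output above, used only as sample input.
SAMPLE = """\
Mon Aug 13 09:19:18 UTC 2012
`script file 2 test.tcl`
Error: Unable to locate the script file 'test.tcl'
"""

print(missing_scripts(SAMPLE))  # ['test.tcl']
```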
Once the files are copied, you have to bring the standby ACE back to the standby hot state. As of now, I know four ways to do that. Actually, only three of them work; the fourth one (provided by Cisco) never worked for me.
Hard way
- Reboot the standby ACE.
axsl01/Admin# reload
This command will reboot the system
Save configurations for all the contexts. Save? [yes/no]: [yes] yes
Perform system reload. [yes/no]: [yes] yes
In this case, you have to ensure all the contexts are active or at least in standby hot state on the other ACE. It takes a “very” long time (ACE boot time is +/- 10 minutes) before you can validate it worked.
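That pre-reload check can be sketched in Python as follows. The function takes the text of `sh ft group brief` captured on the ACE that stays up and returns the FT groups that are neither active nor standby hot there. It assumes the "FT Group ID:" and "My State:" layout of the brief output shown further down in this post; the name `groups_blocking_reload` is hypothetical.

```python
import re

# Assumption: the brief output lists "FT Group ID: <id>" followed by
# "My State:<state>", as in the `sh ft group brief` samples in this post.
BRIEF_RE = re.compile(
    r"FT Group ID:\s*(?P<group>\d+).*?My State:\s*(?P<mine>FSM_FT_STATE_\w+)",
    re.DOTALL,
)

SAFE_STATES = {"FSM_FT_STATE_ACTIVE", "FSM_FT_STATE_STANDBY_HOT"}


def groups_blocking_reload(peer_brief_output):
    """Return FT group IDs that are neither active nor standby hot.

    The text must come from the ACE that will NOT be reloaded.
    """
    return [
        int(m["group"])
        for m in BRIEF_RE.finditer(peer_brief_output)
        if m["mine"] not in SAFE_STATES
    ]


# Sample input in the brief layout, used only for illustration.
SAMPLE = """\
FT Group ID: 1 My State:FSM_FT_STATE_STANDBY_HOT Peer State:FSM_FT_STATE_ACTIVE
Context Name: Admin Context Id: 0
FT Group ID: 2 My State:FSM_FT_STATE_ACTIVE Peer State:FSM_FT_STATE_STANDBY_COLD
Context Name: C1 Context Id: 1
"""

blocking = groups_blocking_reload(SAMPLE)
print("safe to reload the other ACE" if not blocking else f"fix first: {blocking}")
```

An empty list means every context on the surviving ACE is active or standby hot, which is the condition described above.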
Quick and dirty way – aka I’m feeling lucky today
- Go to the Admin context on the active ACE.
- Quickly deactivate/activate the ft group of the context you want to fix.
axsl01/Admin# conf t
Enter configuration commands, one per line. End with CNTL/Z.
axsl01/Admin(config)# ft group 2
axsl01/Admin(config-ft-group)# no inservice
axsl01/Admin(config-ft-group)# inservice
axsl01/Admin(config-ft-group)# end
axsl01/Admin# sh ft group 2 status
FT Group : 2
Configured Status : in-service
Maintenance mode : MAINT_MODE_OFF
My State : FSM_FT_STATE_ACTIVE
Peer State : FSM_FT_STATE_STANDBY_BULK
Peer Id : 1
No. of Contexts : 1
Running cfg sync status : Running configuration sync has completed
Startup cfg sync status : Startup configuration sync has completed

axsl01/Admin# sh ft group 2 status (after some time)
FT Group : 2
Configured Status : in-service
Maintenance mode : MAINT_MODE_OFF
My State : FSM_FT_STATE_ACTIVE
Peer State : FSM_FT_STATE_STANDBY_HOT
Peer Id : 1
No. of Contexts : 1
Running cfg sync status : Running configuration sync has completed
Startup cfg sync status : Startup configuration sync has completed
If you’re quick enough, the active ACE will keep serving the established sessions; only new sessions will be refused while the context is out of service, which should be no more than a few ms.
This method may still be unacceptable in some cases and might be catastrophic if you (or your terminal client) are not quick enough.
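Instead of re-running the status command by hand until the peer leaves STANDBY_BULK, the wait can be scripted. The Python sketch below polls until the peer of a given FT group reports standby hot. `fetch_status` is a hypothetical callable that you would have to supply (for example a wrapper around your SSH client) returning the text of `sh ft group <id> status`; the timeout and interval defaults are arbitrary, since the post only says "after some time".

```python
import re
import time


def peer_state(status_output, group):
    """Extract the peer state of one FT group from `sh ft group status` text."""
    pattern = rf"FT Group\s*:\s*{group}\b.*?Peer State\s*:\s*(FSM_FT_STATE_\w+)"
    match = re.search(pattern, status_output, re.DOTALL)
    return match.group(1) if match else None


def wait_for_standby_hot(fetch_status, group, timeout=120.0, interval=2.0):
    """Poll until the peer of `group` reports standby hot, or time out.

    `fetch_status` is a caller-supplied (hypothetical) function returning
    the current status text; the default durations are arbitrary.
    """
    deadline = time.monotonic() + timeout
    while True:
        if peer_state(fetch_status(), group) == "FSM_FT_STATE_STANDBY_HOT":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in for a real device: first reply is BULK, second is HOT.
replies = iter([
    "FT Group : 2\nPeer State : FSM_FT_STATE_STANDBY_BULK\n",
    "FT Group : 2\nPeer State : FSM_FT_STATE_STANDBY_HOT\n",
])
print(wait_for_standby_hot(lambda: next(replies), group=2, interval=0))  # True
```

If the function returns False, the peer most likely fell back to standby cold, and `sh ft config-error` on the standby is the next thing to look at.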
Cautious way – aka exp-Networks way
- Make the Admin context active on the standby ACE, performing a switchover of the Admin context if required.
- Disable the running-config synchronization for the Admin context.
- Deactivate the ft group of the context you want to fix. Since you’ve disabled the running-config synchronization, the command only takes effect locally; in other words, the active context on the other ACE remains active.
- Activate the ft group again and wait until the context stabilizes in standby hot state.
- Enable the running-config synchronization for the Admin context again.
- Perform a switchover of the Admin context if required.
axsl02/Admin# sh ft group brief
FT Group ID: 1 My State:FSM_FT_STATE_STANDBY_HOT Peer State:FSM_FT_STATE_ACTIVE
Context Name: Admin Context Id: 0
Running Cfg Sync Status: Successful
FT Group ID: 2 My State:FSM_FT_STATE_STANDBY_COLD Peer State:FSM_FT_STATE_ACTIVE
Context Name: C1 Context Id: 1
Running Cfg Sync Status: Successful

axsl02/Admin# ft switchover 1
This command will cause card to switchover (yes/no)? [no] yes
axsl02/Admin#
NOTE: Configuration mode is enabled on all sessions
NOTE: Configuration mode has been disabled on all sessions
NOTE: Configuration mode is enabled on all sessions

axsl02/Admin# sh ft group brief
FT Group ID: 1 My State:FSM_FT_STATE_ACTIVE Peer State:FSM_FT_STATE_STANDBY_HOT
Context Name: Admin Context Id: 0
Running Cfg Sync Status: Successful
FT Group ID: 2 My State:FSM_FT_STATE_STANDBY_COLD Peer State:FSM_FT_STATE_ACTIVE
Context Name: C1 Context Id: 1
Running Cfg Sync Status: Successful

axsl02/Admin# conf t
Enter configuration commands, one per line. End with CNTL/Z.
axsl02/Admin(config)# no ft auto-sync running-config
axsl02/Admin(config)# ft group 2
axsl02/Admin(config-ft-group)# no inservice
axsl02/Admin(config-ft-group)# do sh ft group brief
FT Group ID: 1 My State:FSM_FT_STATE_ACTIVE Peer State:FSM_FT_STATE_STANDBY_HOT
Context Name: Admin Context Id: 0
Running Cfg Sync Status: Successful
FT Group ID: 2 My State:FSM_FT_STATE_INIT Peer State:FSM_FT_STATE_UNKNOWN
Context Name: C1 Context Id: 1
Running Cfg Sync Status: Successful
We can check on the other ACE that context C1 is still active.
axsl01/Admin# sh ft group brief
FT Group ID: 1 My State:FSM_FT_STATE_STANDBY_HOT Peer State:FSM_FT_STATE_ACTIVE
Context Name: Admin Context Id: 0
Running Cfg Sync Status: Successful
FT Group ID: 2 My State:FSM_FT_STATE_ACTIVE Peer State:FSM_FT_STATE_INIT
Context Name: C1 Context Id: 1
Running Cfg Sync Status: Successful
Back on axsl02, we bring the context back into service and re-enable the running-config synchronization.
axsl02/Admin(config-ft-group)# inservice
axsl02/Admin(config-ft-group)# ft auto-sync running-config
axsl02/Admin(config)#
NOTE: Configuration mode has been disabled on all sessions
NOTE: Configuration mode is enabled on all sessions
axsl02/Admin(config)# do sh ft group brief
FT Group ID: 1 My State:FSM_FT_STATE_ACTIVE Peer State:FSM_FT_STATE_STANDBY_HOT
Context Name: Admin Context Id: 0
Running Cfg Sync Status: Successful
FT Group ID: 2 My State:FSM_FT_STATE_STANDBY_HOT Peer State:FSM_FT_STATE_ACTIVE
Context Name: C1 Context Id: 1
Running Cfg Sync Status: Successful
axsl02/Admin(config)# end
axsl02/Admin# ft switchover 1
This command will cause card to switchover (yes/no)? [no] yes
NOTE: Configuration mode has been disabled on all sessions
Cisco way
- Go to the context you want to fix on the active ACE and disable running-config and startup-config synchronization.
- Then enable them again.
Every time I’ve tried this, a resynchronization did take place, but the standby ACE ended up back in standby cold state.
axsl01/C1# sh ft group brief
FT Group ID: 2 My State:FSM_FT_STATE_ACTIVE Peer State:FSM_FT_STATE_STANDBY_COLD
Context Name: C1 Context Id: 1
Running Cfg Sync Status: Successful
axsl01/C1(config)# no ft auto-sync running-config
axsl01/C1(config)# no ft auto-sync startup-config
axsl01/C1(config)# ft auto-sync startup-config
axsl01/C1(config)# ft auto-sync running-config
NOTE: Configuration mode has been disabled on all sessions
NOTE: Configuration mode is enabled on all sessions
axsl01/C1(config)# do sh ft group brief
FT Group ID: 2 My State:FSM_FT_STATE_ACTIVE Peer State:FSM_FT_STATE_STANDBY_COLD
Context Name: C1 Context Id: 1
Running Cfg Sync Status: Successful
That’s all folks! Leave a comment if you liked or disliked this post, have additional input or questions, want to hire our services, or just want to say hello…