Cold ACE?
Ever seen an ACE in standby cold state? This means the standby ACE has not been able to synchronize properly with the active ACE. It usually happens when the standby ACE is missing some certificates, keys or script files referenced by the active ACE, typically after an RMA. In that state, the ACE cannot perform a stateful failover, so all the sessions would be lost should a failover occur.
axsl01/Admin# sh ft group status
FT Group : 1
Configured Status : in-service
Maintenance mode : MAINT_MODE_OFF
My State : FSM_FT_STATE_ACTIVE
Peer State : FSM_FT_STATE_STANDBY_HOT
Peer Id : 1
No. of Contexts : 1
Running cfg sync status : Running configuration sync has completed
Startup cfg sync status : Startup configuration sync has completed
FT Group : 2
Configured Status : in-service
Maintenance mode : MAINT_MODE_OFF
My State : FSM_FT_STATE_ACTIVE
Peer State : FSM_FT_STATE_STANDBY_COLD
Peer Id : 1
No. of Contexts : 1
Running cfg sync status : Peer in Cold State. Incremental Sync Failure: script file not configured
Startup cfg sync status : Peer in Cold State. Incremental Sync Failure: script file not configured
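If you have many FT groups, spotting the cold ones by eye gets tedious. Here is a minimal Python sketch, assuming you have captured the `sh ft group status` output as text with one field per line. The function name `cold_groups` and the trimmed `SAMPLE` are made up for illustration; only the field labels and state names come from the output above.

```python
import re

# Trimmed sample in the same "label : value" layout as the output above.
SAMPLE = """\
FT Group : 1
My State : FSM_FT_STATE_ACTIVE
Peer State : FSM_FT_STATE_STANDBY_HOT
FT Group : 2
My State : FSM_FT_STATE_ACTIVE
Peer State : FSM_FT_STATE_STANDBY_COLD
"""


def cold_groups(status_text):
    """Return the FT group IDs whose peer is reported in STANDBY_COLD state."""
    cold = []
    current = None  # ID of the FT group block currently being read
    for line in status_text.splitlines():
        group = re.match(r"\s*FT Group\s*:\s*(\d+)", line)
        if group:
            current = int(group.group(1))
            continue
        peer = re.match(r"\s*Peer State\s*:\s*(\S+)", line)
        if peer and current is not None:
            if peer.group(1) == "FSM_FT_STATE_STANDBY_COLD":
                cold.append(current)
    return cold


if __name__ == "__main__":
    print(cold_groups(SAMPLE))
```

The parser only relies on the `FT Group` and `Peer State` labels, so extra fields between them are ignored.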
The first step is, of course, to copy the missing files to the standby ACE. Script files have to be copied via TFTP; certificates and keys may be copied via TFTP or pasted in from the terminal. In newer firmware releases you can easily identify what caused the ACE to go into that cold state:
axsl02/C1# sh ft config-error
Mon Aug 13 09:19:18 UTC 2012
`script file 2 test.tcl`
Error: Unable to locate the script file 'test.tcl'
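To build the list of files to copy, you can pull the file names out of that output. The sketch below is a rough Python example that only knows the script-file message shown above; the wording of the errors for missing certificates or keys is not shown here, so they are deliberately not handled. `missing_script_files` and `SAMPLE_ERRORS` are illustrative names.

```python
import re

# Same content as the sh ft config-error output above.
SAMPLE_ERRORS = """\
Mon Aug 13 09:19:18 UTC 2012
`script file 2 test.tcl`
Error: Unable to locate the script file 'test.tcl'
"""

# Matches only the script-file error wording seen in the output above.
MISSING_SCRIPT = re.compile(r"Unable to locate the script file '([^']+)'")


def missing_script_files(config_error_text):
    """Return the missing script file names, in order, without duplicates."""
    names = []
    for name in MISSING_SCRIPT.findall(config_error_text):
        if name not in names:
            names.append(name)
    return names


if __name__ == "__main__":
    for name in missing_script_files(SAMPLE_ERRORS):
        print("copy to standby:", name)
```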
Once the files are copied, you have to bring the standby ACE back to the standby hot state. I know of four ways to do that. Only three of them actually work; the fourth one (provided by Cisco) has never worked for me.
Hard way
- Reboot the standby ACE.
axsl01/Admin# reload
This command will reboot the system
Save configurations for all the contexts. Save? [yes/no]: [yes] yes
Perform system reload. [yes/no]: [yes] yes
In this case, you have to ensure that all the contexts are active, or at least in standby hot state, on the other ACE. It takes a “very” long time (ACE boot time is roughly 10 minutes) before you can validate that it worked.
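That pre-reload check can be scripted. The following Python sketch assumes you have captured `sh ft group brief` from the other ACE (the one that stays up) as text, in the `FT Group ID: n My State:...` layout used in the transcripts further down. It reports every group whose local state is neither active nor standby hot. The names `unsafe_groups`, `SAFE_STATES` and the sample strings are illustrative.

```python
import re

# States in which the surviving ACE can take over (or keep) a context
# without losing sessions, per the check described above.
SAFE_STATES = {"FSM_FT_STATE_ACTIVE", "FSM_FT_STATE_STANDBY_HOT"}

# Example capture from the ACE that stays up: everything is safe.
OTHER_ACE_BRIEF = """\
FT Group ID: 1 My State:FSM_FT_STATE_STANDBY_HOT
Peer State:FSM_FT_STATE_ACTIVE
FT Group ID: 2 My State:FSM_FT_STATE_ACTIVE
Peer State:FSM_FT_STATE_INIT
"""


def unsafe_groups(brief_text):
    """Return {group_id: my_state} for groups that are not safe locally."""
    pairs = re.findall(r"FT Group ID:\s*(\d+)\s+My State:\s*(\S+)", brief_text)
    return {int(gid): state for gid, state in pairs if state not in SAFE_STATES}


if __name__ == "__main__":
    problems = unsafe_groups(OTHER_ACE_BRIEF)
    print("safe to reload the peer" if not problems else problems)
```

An empty result means every group on the surviving ACE is either active or standby hot.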
Quick and dirty way – aka I’m feeling lucky today
- Go to the Admin context on the active ACE.
- Quickly deactivate/activate the ft group of the context you want to fix.
axsl01/Admin# conf t
Enter configuration commands, one per line. End with CNTL/Z.
axsl01/Admin(config)# ft group 2
axsl01/Admin(config-ft-group)# no inservice
axsl01/Admin(config-ft-group)# inservice
axsl01/Admin(config-ft-group)# end
axsl01/Admin# sh ft group 2 status
FT Group : 2
Configured Status : in-service
Maintenance mode : MAINT_MODE_OFF
My State : FSM_FT_STATE_ACTIVE
Peer State : FSM_FT_STATE_STANDBY_BULK
Peer Id : 1
No. of Contexts : 1
Running cfg sync status : Running configuration sync has completed
Startup cfg sync status : Startup configuration sync has completed
axsl01/Admin# sh ft group 2 status (after some time)
FT Group : 2
Configured Status : in-service
Maintenance mode : MAINT_MODE_OFF
My State : FSM_FT_STATE_ACTIVE
Peer State : FSM_FT_STATE_STANDBY_HOT
Peer Id : 1
No. of Contexts : 1
Running cfg sync status : Running configuration sync has completed
Startup cfg sync status : Startup configuration sync has completed
If you’re quick enough, the active ACE keeps serving the established sessions; only new sessions are refused while the context is out of service, which is a few ms if you’re quick enough.
This method may still be unacceptable in some cases, and might be catastrophic if you (or your terminal client) are not quick enough.
Cautious way – aka exp-Networks way
- Make the Admin context active on the standby ACE, perform a switchover of the Admin context if required.
- Disable the running-config synchronization for the Admin context.
- Deactivate the ft group of the context you want to fix. Since you’ve disabled the running-config synchronization, the command only takes effect locally; in other words, the active context on the other ACE remains active.
- Activate the ft group again and wait until the context stabilizes in standby hot state.
- Enable the running-config synchronization for the Admin context again.
- Perform a switchover of the Admin context if required.
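The steps above can be sketched as an ordered command list, for example to paste into a change plan. This is only a Python illustration: the commands themselves are the ones used in the transcript below, but the helper `cautious_resync_steps`, its parameters and the notes are my own naming. It does not talk to the ACE, does not answer the switchover confirmation prompt, and does not do the waiting for you.

```python
def cautious_resync_steps(ft_group_id, admin_ft_group_id=1, switch_admin=True):
    """Return (command, note) pairs to run in the Admin context of the standby ACE.

    switch_admin=True adds the "if required" Admin switchovers at both ends.
    """
    steps = []
    if switch_admin:
        steps.append((f"ft switchover {admin_ft_group_id}",
                      "make the Admin context active on this ACE"))
    steps += [
        ("conf t", "enter configuration mode"),
        ("no ft auto-sync running-config",
         "keep the next commands local to this ACE"),
        (f"ft group {ft_group_id}", "select the FT group of the cold context"),
        ("no inservice", "takes effect locally only; the peer stays active"),
        ("inservice", "then wait for the standby hot state"),
        ("ft auto-sync running-config", "re-enable synchronization"),
        ("end", "leave configuration mode"),
    ]
    if switch_admin:
        steps.append((f"ft switchover {admin_ft_group_id}",
                      "give the Admin context back to the other ACE"))
    return steps


if __name__ == "__main__":
    for command, note in cautious_resync_steps(2):
        print(f"{command:<32} ! {note}")
```

Keeping the notes next to the commands makes it harder to forget the step that matters most: synchronization must be off before `no inservice` is entered.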
axsl02/Admin# sh ft group brief
FT Group ID: 1 My State:FSM_FT_STATE_STANDBY_HOT
Peer State:FSM_FT_STATE_ACTIVE
Context Name: Admin
Context Id: 0 Running Cfg Sync Status: Successful
FT Group ID: 2 My State:FSM_FT_STATE_STANDBY_COLD
Peer State:FSM_FT_STATE_ACTIVE
Context Name: C1
Context Id: 1 Running Cfg Sync Status: Successful
axsl02/Admin# ft switchover 1
This command will cause card to switchover (yes/no)? [no] yes
axsl02/Admin#
NOTE: Configuration mode is enabled on all sessions
NOTE: Configuration mode has been disabled on all sessions
NOTE: Configuration mode is enabled on all sessions
axsl02/Admin# sh ft group brief
FT Group ID: 1 My State:FSM_FT_STATE_ACTIVE
Peer State:FSM_FT_STATE_STANDBY_HOT
Context Name: Admin
Context Id: 0 Running Cfg Sync Status: Successful
FT Group ID: 2 My State:FSM_FT_STATE_STANDBY_COLD
Peer State:FSM_FT_STATE_ACTIVE
Context Name: C1
Context Id: 1 Running Cfg Sync Status: Successful
axsl02/Admin# conf t
Enter configuration commands, one per line. End with CNTL/Z.
axsl02/Admin(config)# no ft auto-sync running-config
axsl02/Admin(config)# ft group 2
axsl02/Admin(config-ft-group)# no inservice
axsl02/Admin(config-ft-group)# do sh ft group brief
FT Group ID: 1 My State:FSM_FT_STATE_ACTIVE
Peer State:FSM_FT_STATE_STANDBY_HOT
Context Name: Admin
Context Id: 0 Running Cfg Sync Status: Successful
FT Group ID: 2 My State:FSM_FT_STATE_INIT
Peer State:FSM_FT_STATE_UNKNOWN
Context Name: C1
Context Id: 1 Running Cfg Sync Status: Successful
We can check on the other ACE that context C1 is still active.
axsl01/Admin# sh ft group brief
FT Group ID: 1 My State:FSM_FT_STATE_STANDBY_HOT
Peer State:FSM_FT_STATE_ACTIVE
Context Name: Admin
Context Id: 0 Running Cfg Sync Status: Successful
FT Group ID: 2 My State:FSM_FT_STATE_ACTIVE
Peer State:FSM_FT_STATE_INIT
Context Name: C1
Context Id: 1 Running Cfg Sync Status: Successful
Back on the other ACE to bring the context back into service.
axsl02/Admin(config-ft-group)# inservice
axsl02/Admin(config-ft-group)# ft auto-sync running-config
axsl02/Admin(config)#
NOTE: Configuration mode has been disabled on all sessions
NOTE: Configuration mode is enabled on all sessions
axsl02/Admin(config)# do sh ft group brief
FT Group ID: 1 My State:FSM_FT_STATE_ACTIVE
Peer State:FSM_FT_STATE_STANDBY_HOT
Context Name: Admin
Context Id: 0 Running Cfg Sync Status: Successful
FT Group ID: 2 My State:FSM_FT_STATE_STANDBY_HOT
Peer State:FSM_FT_STATE_ACTIVE
Context Name: C1
Context Id: 1 Running Cfg Sync Status: Successful
axsl02/Admin(config)# end
axsl02/Admin# ft switchover 1
This command will cause card to switchover (yes/no)? [no] yes
NOTE: Configuration mode has been disabled on all sessions
Cisco way
- Go to the context you want to fix on the active ACE and disable running-config and startup-config synchronization.
- Then enable them again.
Every time I’ve tried this, a resynchronization did take place, but the standby ACE ended up back in standby cold state.
axsl01/C1# sh ft group brief
FT Group ID: 2 My State:FSM_FT_STATE_ACTIVE
Peer State:FSM_FT_STATE_STANDBY_COLD
Context Name: C1
Context Id: 1 Running Cfg Sync Status: Successful
axsl01/C1(config)# no ft auto-sync running-config
axsl01/C1(config)# no ft auto-sync startup-config
axsl01/C1(config)# ft auto-sync startup-config
axsl01/C1(config)# ft auto-sync running-config
NOTE: Configuration mode has been disabled on all sessions
NOTE: Configuration mode is enabled on all sessions
axsl01/C1(config)# do sh ft group brief
FT Group ID: 2 My State:FSM_FT_STATE_ACTIVE
Peer State:FSM_FT_STATE_STANDBY_COLD
Context Name: C1
Context Id: 1 Running Cfg Sync Status: Successful