Cold ACE?

ACE Aug 13, 2012

Ever seen an ACE in standby cold state? This means the standby ACE has not been able to synchronize properly with the active ACE. It usually happens when the standby ACE is missing some certificates, keys or script files referenced by the active ACE, which is typically the case after an RMA. In that state, the ACE cannot perform a stateful failover, and all sessions would be lost should a failover occur.

axsl01/Admin# sh ft group status
FT Group                     : 1
Configured Status            : in-service
Maintenance mode             : MAINT_MODE_OFF
My State                     : FSM_FT_STATE_ACTIVE
Peer State                   : FSM_FT_STATE_STANDBY_HOT
Peer Id                      : 1
No. of Contexts              : 1
Running cfg sync status      : Running configuration sync has completed
Startup cfg sync status      : Startup configuration sync has completed

FT Group                     : 2
Configured Status            : in-service
Maintenance mode             : MAINT_MODE_OFF
My State                     : FSM_FT_STATE_ACTIVE
Peer State                   : FSM_FT_STATE_STANDBY_COLD
Peer Id                      : 1
No. of Contexts              : 1
Running cfg sync status      : Peer in Cold State. Incremental Sync Failure: script file not configured

Startup cfg sync status      : Peer in Cold State. Incremental Sync Failure: script file not configured
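
If you have many FT groups, scanning this output by eye gets tedious. Here is a minimal Python sketch that picks out the groups whose peer is in standby cold state. It assumes you have already captured the text of `sh ft group status` somehow (how you collect it is up to you), and the function names are purely illustrative.

```python
import re

# Matches 'Key      : value' lines; only the first colon separates key and value.
FIELD_RE = re.compile(r"^\s*([^:]+?)\s*:\s*(.*)$")


def parse_ft_group_status(text):
    """Split captured 'sh ft group status' output into one dict per FT group."""
    groups = []
    current = None
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if not match:
            continue  # prompts and blank lines carry no field
        key, value = match.group(1), match.group(2).strip()
        if key == "FT Group":
            # A new 'FT Group' line starts a new record.
            current = {"FT Group": value}
            groups.append(current)
        elif current is not None:
            current[key] = value
    return groups


def cold_groups(text):
    """Return the IDs of the FT groups whose peer is in standby cold state."""
    return [
        group["FT Group"]
        for group in parse_ft_group_status(text)
        if group.get("Peer State") == "FSM_FT_STATE_STANDBY_COLD"
    ]
```

Fed with the output above, `cold_groups` would report group 2 only.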

The first step is, of course, to copy the missing files to the standby ACE. Script files have to be copied via TFTP; certificates and keys may be copied via TFTP or terminal copy/paste. In newer firmware releases you can easily identify what caused the ACE to go into the cold state:

axsl02/C1# sh ft config-error
Mon Aug 13 09:19:18 UTC 2012

`script file 2 test.tcl`
Error: Unable to locate the script file 'test.tcl'
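
To turn that output into a list of files to copy, a small Python sketch like the one below could help. It only recognizes the one error message shown above; messages for missing certificates or keys may be worded differently, so treat the pattern as an assumption to extend. The function name is illustrative.

```python
import re

# Only the message form shown in the transcript above is handled here.
MISSING_SCRIPT_RE = re.compile(
    r"Error: Unable to locate the script file '([^']+)'"
)


def missing_script_files(text):
    """Return the script file names reported as missing, in order, without repeats."""
    names = []
    for name in MISSING_SCRIPT_RE.findall(text):
        if name not in names:
            names.append(name)
    return names
```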

Once the files are copied, you have to bring the standby ACE back to the standby hot state. As of now, I know four ways to do that. Actually, only three of them work; the fourth one (provided by Cisco) has never worked for me.

Hard way

  1. Reboot the standby ACE.
axsl01/Admin# reload
This command will reboot the system
Save configurations for all the contexts. Save? [yes/no]: [yes] yes
Perform system reload. [yes/no]: [yes] yes

In this case, you have to ensure that all the contexts are active, or at least in standby hot state, on the other ACE. It also takes a “very” long time (ACE boot time is +/- 10 minutes) before you can validate that it worked.
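
That precondition can be checked mechanically before typing `reload`. The Python sketch below parses the text of `sh ft group brief` taken on the other ACE and lists the contexts that are neither active nor in standby hot state there. It assumes the output layout shown in the transcripts of this post; the function names are illustrative.

```python
import re

GROUP_RE = re.compile(r"FT Group ID:\s*(\d+)\s+My State:\s*(\S+)")
PEER_RE = re.compile(r"Peer State:\s*(\S+)")
CONTEXT_RE = re.compile(r"Context Name:\s*(\S+)")

# States considered safe on the ACE that stays up during the reload.
SAFE_STATES = {"FSM_FT_STATE_ACTIVE", "FSM_FT_STATE_STANDBY_HOT"}


def parse_ft_group_brief(text):
    """Turn captured 'sh ft group brief' output into one dict per FT group."""
    groups = []
    for line in text.splitlines():
        match = GROUP_RE.search(line)
        if match:
            groups.append({"id": match.group(1), "my_state": match.group(2)})
            continue
        if not groups:
            continue  # ignore anything before the first group
        match = PEER_RE.search(line)
        if match:
            groups[-1]["peer_state"] = match.group(1)
            continue
        match = CONTEXT_RE.search(line)
        if match:
            groups[-1]["context"] = match.group(1)
    return groups


def contexts_blocking_reload(text):
    """Return the contexts that are neither active nor standby hot locally."""
    return [
        group.get("context", group["id"])
        for group in parse_ft_group_brief(text)
        if group["my_state"] not in SAFE_STATES
    ]
```

An empty list means every context on that ACE is active or standby hot; anything else deserves a closer look before rebooting its peer.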

Quick and dirty way – aka I’m feeling lucky today

  1. Go to the Admin context on the active ACE.
  2. Quickly deactivate/activate the ft group of the context you want to fix.
axsl01/Admin# conf t
Enter configuration commands, one per line.  End with CNTL/Z.
axsl01/Admin(config)# ft group 2
axsl01/Admin(config-ft-group)# no inservice
axsl01/Admin(config-ft-group)# inservice
axsl01/Admin(config-ft-group)# end
axsl01/Admin# sh ft group 2 status

FT Group                     : 2
Configured Status            : in-service
Maintenance mode             : MAINT_MODE_OFF
My State                     : FSM_FT_STATE_ACTIVE
Peer State                   : FSM_FT_STATE_STANDBY_BULK
Peer Id                      : 1
No. of Contexts              : 1
Running cfg sync status      : Running configuration sync has completed
Startup cfg sync status      : Startup configuration sync has completed
axsl01/Admin# sh ft group 2 status             (after some time)

FT Group                     : 2
Configured Status            : in-service
Maintenance mode             : MAINT_MODE_OFF
My State                     : FSM_FT_STATE_ACTIVE
Peer State                   : FSM_FT_STATE_STANDBY_HOT
Peer Id                      : 1
No. of Contexts              : 1
Running cfg sync status      : Running configuration sync has completed
Startup cfg sync status      : Startup configuration sync has completed

If you’re quick enough, the active ACE will keep serving the established sessions; only new sessions will be refused while the context is not in service, which can be as short as a few ms.

This method may still be unacceptable in some cases and might be catastrophic if you (or your terminal client) are not quick enough.
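
Rather than retyping `sh ft group 2 status` until the peer leaves the bulk state, the check can be polled. The Python sketch below is only an outline: `fetch_status` is a hypothetical callable that you would have to supply (it must return the text of the status command), and the attempt count and interval are arbitrary defaults.

```python
import re
import time

PEER_STATE_RE = re.compile(r"Peer State\s*:\s*(\S+)")


def wait_for_peer_hot(fetch_status, attempts=30, interval_s=2.0, sleep=time.sleep):
    """Poll until the peer reports standby hot.

    fetch_status is a caller-supplied (hypothetical) function returning the
    text of 'sh ft group <id> status'. Returns (reached_hot, states_seen).
    """
    seen = []
    for attempt in range(attempts):
        match = PEER_STATE_RE.search(fetch_status())
        state = match.group(1) if match else None
        # Record each state transition once, in the order observed.
        if state and (not seen or seen[-1] != state):
            seen.append(state)
        if state == "FSM_FT_STATE_STANDBY_HOT":
            return True, seen
        if attempt < attempts - 1:
            sleep(interval_s)
    return False, seen
```

The list of states seen makes it easy to tell a peer that went through bulk sync from one that fell straight back to cold.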

Cautious way – aka exp-Networks way

  1. Make the Admin context active on the standby ACE, perform a switchover of the Admin context if required.
  2. Disable the running-config synchronization for the Admin context.
  3. Deactivate the ft group of the context you want to fix. Since you’ve disabled the running-config synchronization, the command only takes effect locally; in other words, the active context on the other ACE remains active.
  4. Activate the ft group again and wait until the context stabilizes in standby hot state.
  5. Enable the running-config synchronization for the Admin context again.
  6. Perform a switchover of the Admin context if required.
axsl02/Admin# sh ft group brief

FT Group ID: 1  My State:FSM_FT_STATE_STANDBY_HOT       
                Peer State:FSM_FT_STATE_ACTIVE
                Context Name: Admin
                Context Id: 0   Running Cfg Sync Status: Successful
                
FT Group ID: 2  My State:FSM_FT_STATE_STANDBY_COLD
                Peer State:FSM_FT_STATE_ACTIVE
                Context Name: C1
                Context Id: 1   Running Cfg Sync Status: Successful

axsl02/Admin# ft switchover 1
This command will cause card to switchover (yes/no)?  [no] yes
axsl02/Admin#

NOTE: Configuration mode is enabled on all sessions
NOTE: Configuration mode has been disabled on all sessions
NOTE: Configuration mode is enabled on all sessions

axsl02/Admin# sh ft group brief

FT Group ID: 1  My State:FSM_FT_STATE_ACTIVE
                Peer State:FSM_FT_STATE_STANDBY_HOT
                Context Name: Admin
                Context Id: 0   Running Cfg Sync Status: Successful

FT Group ID: 2  My State:FSM_FT_STATE_STANDBY_COLD
                Peer State:FSM_FT_STATE_ACTIVE
                Context Name: C1
                Context Id: 1   Running Cfg Sync Status: Successful

axsl02/Admin# conf t
Enter configuration commands, one per line.  End with CNTL/Z.
axsl02/Admin(config)# no ft auto-sync running-config
axsl02/Admin(config)# ft group 2
axsl02/Admin(config-ft-group)# no inservice
axsl02/Admin(config-ft-group)# do sh ft group brief

FT Group ID: 1  My State:FSM_FT_STATE_ACTIVE
                Peer State:FSM_FT_STATE_STANDBY_HOT
                Context Name: Admin
                Context Id: 0   Running Cfg Sync Status: Successful

FT Group ID: 2  My State:FSM_FT_STATE_INIT
                Peer State:FSM_FT_STATE_UNKNOWN
                Context Name: C1
                Context Id: 1   Running Cfg Sync Status: Successful

We can check on the other ACE that context C1 is still active.

axsl01/Admin# sh ft group brief

FT Group ID: 1  My State:FSM_FT_STATE_STANDBY_HOT
                Peer State:FSM_FT_STATE_ACTIVE
                Context Name: Admin
                Context Id: 0   Running Cfg Sync Status: Successful

FT Group ID: 2  My State:FSM_FT_STATE_ACTIVE
                Peer State:FSM_FT_STATE_INIT
                Context Name: C1
                Context Id: 1   Running Cfg Sync Status: Successful

Back on the other ACE to bring the context back into service.

axsl02/Admin(config-ft-group)# inservice
axsl02/Admin(config-ft-group)# ft auto-sync running-config
axsl02/Admin(config)#

NOTE: Configuration mode has been disabled on all sessions
NOTE: Configuration mode is enabled on all sessions

axsl02/Admin(config)# do sh ft group brief

FT Group ID: 1  My State:FSM_FT_STATE_ACTIVE
                Peer State:FSM_FT_STATE_STANDBY_HOT
                Context Name: Admin
                Context Id: 0   Running Cfg Sync Status: Successful

FT Group ID: 2  My State:FSM_FT_STATE_STANDBY_HOT
                Peer State:FSM_FT_STATE_ACTIVE
                Context Name: C1
                Context Id: 1   Running Cfg Sync Status: Successful

axsl02/Admin(config)# end
axsl02/Admin# ft switchover 1
This command will cause card to switchover (yes/no)?  [no] yes

NOTE: Configuration mode has been disabled on all sessions
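
For reference, the whole sequence typed on the standby ACE can be condensed into an ordered command list. The Python sketch below only assembles the commands used in the transcript above; it does not talk to any device, and the function and parameter names are illustrative. The second `do sh ft group brief` stands for the check you repeat until the context shows standby hot.

```python
def cautious_resync_plan(group_id, admin_group_id=1, switch_admin=True):
    """Return the commands of the cautious procedure, in order.

    All commands are meant for the Admin context of the standby ACE.
    switch_admin says whether the Admin context must first be made active
    there (and handed back at the end).
    """
    plan = []
    if switch_admin:
        plan.append(f"ft switchover {admin_group_id}")
    plan += [
        "conf t",
        "no ft auto-sync running-config",  # keep the next change local
        f"ft group {group_id}",
        "no inservice",
        "do sh ft group brief",            # peer context should still be active
        "inservice",
        "do sh ft group brief",            # repeat until standby hot
        "ft auto-sync running-config",
        "end",
    ]
    if switch_admin:
        plan.append(f"ft switchover {admin_group_id}")
    return plan
```

Calling `cautious_resync_plan(2)` yields the same command order as the session shown above.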

Cisco way

  1. Go to the context you want to fix on the active ACE and disable running-config and startup-config synchronization.
  2. Then enable them again.

Every time I’ve tried this, a resynchronization did take place, but the standby ACE ended up back in standby cold state.

axsl01/C1# sh ft group brief

FT Group ID: 2  My State:FSM_FT_STATE_ACTIVE
                Peer State:FSM_FT_STATE_STANDBY_COLD
                Context Name: C1
                Context Id: 1   Running Cfg Sync Status: Successful

axsl01/C1(config)# no ft auto-sync running-config
axsl01/C1(config)# no ft auto-sync startup-config
axsl01/C1(config)# ft auto-sync startup-config
axsl01/C1(config)# ft auto-sync running-config

NOTE: Configuration mode has been disabled on all sessions
NOTE: Configuration mode is enabled on all sessions

axsl01/C1(config)# do sh ft group brief

FT Group ID: 2  My State:FSM_FT_STATE_ACTIVE
                Peer State:FSM_FT_STATE_STANDBY_COLD
                Context Name: C1
                Context Id: 1   Running Cfg Sync Status: Successful

That’s all folks! Leave a comment if you liked it, disliked it, have additional input or questions, want to hire our services, or just want to say hello…

Christophe Lemaire

Christophe has been a network and security engineer for more than 20 years. He has always been eager to learn new technologies and to share them with his peers. He's always happy to help, so don't hesitate...
