ACE software upgrade

ACE Feb 10, 2010

Cisco Application Control Engine (ACE) module load balancers are designed to work in standalone mode or in cluster mode. In standalone mode, a software upgrade obviously has a major impact on the traffic going through the load balancer: all sessions are dropped and no new session is accepted until the ACE restarts with the new image (up to 8 minutes).

In cluster mode, however, you can upgrade the software with no or very limited impact if you follow the correct sequence of operations. Here are the steps I used last time; the upgrade went perfectly and was transparent to the users.

Note that this procedure has been tested only on ACE modules for the Catalyst 6500, but it should remain valid for the ACE 4710 appliances.

Step 1

First you need to make sure all the contexts are properly synchronized and the standby contexts are in STANDBY_HOT state.

ACE_1/Admin# sh ft group brief
FT Group ID: 1  My State:FSM_FT_STATE_ACTIVE
                Peer State:FSM_FT_STATE_STANDBY_HOT
                Context Name: Admin     Context Id: 0
FT Group ID: 2  My State:FSM_FT_STATE_ACTIVE
                Peer State:FSM_FT_STATE_STANDBY_COLD
                Context Name: C1        Context Id: 4
FT Group ID: 3  My State:FSM_FT_STATE_ACTIVE
                Peer State:FSM_FT_STATE_STANDBY_HOT
                Context Name: C2        Context Id: 3

As you can see here, context C1 is stuck in STANDBY_COLD state. Taking its FT group out of service on the standby ACE and then putting it back in service usually solves the issue. If it does not, the software upgrade will not be fully transparent for that context: current sessions will be dropped, but new sessions will be accepted after the failover. If that is acceptable to you, go on with the upgrade; otherwise, try to find out why the context is not in STANDBY_HOT state.

Note that it might take several minutes to leave the STANDBY_BULK state (it took 2 minutes during my tests).

ACE_2/Admin(config)# ft group 2
ACE_2/Admin(config-ft-group)# no inservice
ACE_2/Admin(config-ft-group)# do sh ft group 2 detail

FT Group                     : 2
No. of Contexts              : 1
Context Name                 : C1
Context Id                   : 4
Configured Status            : out-of-service
Maintenance mode             : MAINT_MODE_OFF
My State                     : FSM_FT_STATE_INIT
My Config Priority           : 90
My Net Priority              : 90
My Preempt                   : Enabled
Peer State                   : FSM_FT_STATE_UNKNOWN
Peer Config Priority         : Unknown
Peer Net Priority            : Unknown
Peer Preempt                 : Unknown
Peer Id                      : 1
Last State Change time       : Wed Feb  3 14:35:36 2010
Running cfg sync enabled     : Enabled
Running cfg sync status      :
Startup cfg sync enabled     : Enabled
Startup cfg sync status      :
Bulk sync done for ARP: 0
Bulk sync done for LB: 0
Bulk sync done for ICM: 0
ACE_2/Admin(config-ft-group)# inservice

NOTE: Configuration mode has been disabled on all sessions

ACE_2/Admin(config-ft-group)# do sh ft group 2 detail

FT Group                     : 2
No. of Contexts              : 1
Context Name                 : C1
Context Id                   : 4
Configured Status            : in-service
Maintenance mode             : MAINT_MODE_OFF
My State                     : FSM_FT_STATE_STANDBY_BULK
My Config Priority           : 90
My Net Priority              : 90
My Preempt                   : Enabled
Peer State                   : FSM_FT_STATE_ACTIVE
Peer Config Priority         : 120
Peer Net Priority            : 120
Peer Preempt                 : Enabled
Peer Id                      : 1
Last State Change time       : Wed Feb  3 14:36:02 2010
Running cfg sync enabled     : Enabled
Running cfg sync status      : Running configuration sync has completed
Startup cfg sync enabled     : Enabled
Startup cfg sync status      : Startup configuration sync has completed
Bulk sync done for ARP: 1
Bulk sync done for LB: 0
Bulk sync done for ICM: 0
ACE_2/Admin(config-ft-group)# do sh ft group 2 detail

FT Group                     : 2
No. of Contexts              : 1
Context Name                 : C1
Context Id                   : 4
Configured Status            : in-service
Maintenance mode             : MAINT_MODE_OFF
My State                     : FSM_FT_STATE_STANDBY_HOT
My Config Priority           : 90
My Net Priority              : 90
My Preempt                   : Enabled
Peer State                   : FSM_FT_STATE_ACTIVE
Peer Config Priority         : 120
Peer Net Priority            : 120
Peer Preempt                 : Enabled
Peer Id                      : 1
Last State Change time       : Wed Feb  3 14:37:51 2010
Running cfg sync enabled     : Enabled
Running cfg sync status      : Running configuration sync has completed
Startup cfg sync enabled     : Enabled
Startup cfg sync status      : Startup configuration sync has completed
Bulk sync done for ARP: 1
Bulk sync done for LB: 2
Bulk sync done for ICM: 2
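
If you have many contexts, the check at the start of this step can be scripted. Here is a minimal Python sketch that parses the text of sh ft group brief and lists the contexts whose peer is not in STANDBY_HOT. It assumes you have already captured the command output as a string (how you collect it is up to you), and the function names are mine, not part of any Cisco tool.

```python
import re

# Sample text copied from the "sh ft group brief" output shown in Step 1.
SAMPLE = """\
FT Group ID: 1  My State:FSM_FT_STATE_ACTIVE
                Peer State:FSM_FT_STATE_STANDBY_HOT
                Context Name: Admin     Context Id: 0
FT Group ID: 2  My State:FSM_FT_STATE_ACTIVE
                Peer State:FSM_FT_STATE_STANDBY_COLD
                Context Name: C1        Context Id: 4
FT Group ID: 3  My State:FSM_FT_STATE_ACTIVE
                Peer State:FSM_FT_STATE_STANDBY_HOT
                Context Name: C2        Context Id: 3
"""

# One match per FT group; \s+ also spans the line breaks between fields.
GROUP_RE = re.compile(
    r"FT Group ID:\s*(?P<gid>\d+)\s+My State:\s*(?P<my>\S+)\s+"
    r"Peer State:\s*(?P<peer>\S+)\s+"
    r"Context Name:\s*(?P<ctx>\S+)\s+Context Id:\s*(?P<cid>\d+)"
)


def parse_ft_brief(text):
    """Return one dict per FT group found in the command output."""
    groups = []
    for m in GROUP_RE.finditer(text):
        groups.append({
            "group": int(m.group("gid")),
            "context": m.group("ctx"),
            "my_state": m.group("my").replace("FSM_FT_STATE_", ""),
            "peer_state": m.group("peer").replace("FSM_FT_STATE_", ""),
        })
    return groups


def not_hot(groups):
    """Names of the contexts whose peer (the standby) is not STANDBY_HOT."""
    return [g["context"] for g in groups if g["peer_state"] != "STANDBY_HOT"]
```

Calling not_hot(parse_ft_brief(SAMPLE)) on the output above flags context C1, the one that needed the no inservice / inservice cycle.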

Step 2

On the ACE, preemption is enabled by default for all the contexts. It needs to be disabled to perform a manual failover.

ACE_1/Admin(config)# ft group 1
ACE_1/Admin(config-ft-group)# no preempt
ACE_1/Admin(config-ft-group)# ft group 2
ACE_1/Admin(config-ft-group)# no preempt
ACE_1/Admin(config-ft-group)# ft group 3
ACE_1/Admin(config-ft-group)# no preempt
ACE_1/Admin(config-ft-group)# end
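
With more than a handful of FT groups, typing these lines by hand gets tedious. The sketch below only builds the list of configuration lines for a set of group IDs; it does not talk to the device. The helper name and the idea of pasting or pushing the result yourself are my assumptions, while the commands are the ones used in this step.

```python
def preempt_commands(group_ids, enable=False):
    """Build the config-mode lines that toggle preemption on each FT group.

    enable=False produces "no preempt" (used before the upgrade),
    enable=True produces "preempt" (used to restore it afterwards).
    """
    keyword = "preempt" if enable else "no preempt"
    lines = []
    for gid in group_ids:
        lines.append(f"ft group {gid}")
        lines.append(keyword)
    lines.append("end")
    return lines
```

For example, "\n".join(preempt_commands([1, 2, 3])) gives the same sequence as the session above, ready to paste after conf t.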

Step 3

Download the new software image to the active and standby ACEs. Here I chose TFTP because I didn’t have an FTP server configured in the lab; FTP can be used as well and is definitely faster.

ACE_1/Admin# copy tftp: image:
Enter source filename[]? c6ace-t1k9-mz.A2_2_3.bin
Enter the destination filename[]? [c6ace-t1k9-mz.A2_2_3.bin]
Address of remote host[]? 10.1.1.1
Trying to connect to tftp server......
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
(…)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
TFTP get operation was successful
 31361516 bytes copied
ACE_1/Admin#
ACE_1/Admin# dir image:
 30788103  Apr 15 13:14:48 2009 c6ace-t1k9-mz.A2_1_4a.bin
 31361516  Feb  3 14:43:45 2010 c6ace-t1k9-mz.A2_2_3.bin
          Usage for image: filesystem
          461848576 bytes total used
          577126400 bytes free
         1038974976 total bytes

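Check that the file size is correct: the size listed by dir image: should match the number of bytes reported at the end of the copy.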
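
The size check can also be done by a script on both ACEs. This is a minimal sketch that reads the file entries out of dir image: output and compares one of them with the size you expect. It assumes the entry layout shown above (size, date, time, year, name); the function names are hypothetical.

```python
import re

# File entries copied from the "dir image:" output above.
DIR_OUTPUT = """\
 30788103  Apr 15 13:14:48 2009 c6ace-t1k9-mz.A2_1_4a.bin
 31361516  Feb  3 14:43:45 2010 c6ace-t1k9-mz.A2_2_3.bin
"""

# size, month, day, hh:mm:ss, year, file name
ENTRY_RE = re.compile(
    r"^\s*(\d+)\s+\w{3}\s+\d+\s+[\d:]+\s+\d{4}\s+(\S+)\s*$", re.M
)


def image_sizes(dir_text):
    """Map each image file name to its size in bytes."""
    return {name: int(size) for size, name in ENTRY_RE.findall(dir_text)}


def size_matches(dir_text, filename, expected_bytes):
    """True only if the file is listed and has exactly the expected size."""
    return image_sizes(dir_text).get(filename) == expected_bytes
```

Here size_matches(DIR_OUTPUT, "c6ace-t1k9-mz.A2_2_3.bin", 31361516) is true, since 31361516 is the byte count reported by the TFTP copy.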

Step 4

Change the boot string on the active ACE; it will be synced to the standby ACE. Since configuration mode is disabled on the standby ACE, this is the only option anyway.

ACE_1/Admin# sh run | i boot
Generating configuration....
boot system image:c6ace-t1k9-mz.A2_1_4a.bin
ACE_1/Admin# conf t
Enter configuration commands, one per line.  End with CNTL/Z.
ACE_1/Admin(config)# no boot system image:c6ace-t1k9-mz.A2_1_4a.bin
ACE_1/Admin(config)# boot system image:c6ace-t1k9-mz.A2_2_3.bin
ACE_1/Admin(config)# exit
ACE_1/Admin# wr mem all
Generating configuration....
running config of context Admin saved
Generating configuration....
running config of context C2 saved
Generating configuration....
running config of context C1 saved
Please wait ... sync to compact flash in progress.

This may take a few minutes to complete

Sync Done
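
Before reloading anything, it is worth confirming on the standby that the new boot string really arrived. A small sketch, assuming you have captured the output of sh run | i boot from each ACE as text; the helper name is hypothetical.

```python
def boot_images(show_run_boot_text):
    """Return the image names from 'boot system image:<file>' lines, in order."""
    prefix = "boot system image:"
    images = []
    for line in show_run_boot_text.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            images.append(line[len(prefix):])
    return images
```

If boot_images(...) on the standby does not return exactly the new image name, the configuration has not synced and the reload in Step 6 would bring the old version back up.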

Step 5 (optional)

Create a checkpoint in all contexts on the active and standby devices.

ACE_2/Admin# checkpoint create 20100203
Generating configuration....
Created configuration checkpoint '20100203'
ACE_2/Admin# changeto C2

NOTE: Configuration mode has been disabled on all sessions

ACE_2/C2# checkpoint create 20100203
Generating configuration....
Created configuration checkpoint '20100203'
ACE_2/C2# changeto C1

NOTE: Configuration mode has been disabled on all sessions

ACE_2/C1# checkpoint create 20100203
Generating configuration....
Created configuration checkpoint '20100203'
ACE_2/C1# changeto Admin

Step 6

Reload the standby device

ACE_2/Admin# reload
This command will reboot the system
Save configurations for all the contexts. Save? [yes/no]: [yes] no 
(already done in step 4)
Perform system reload. [yes/no]: [yes]

NOTE: Configuration mode is enabled on all sessions

Connection to ACE_2 closed by remote host.
Connection to ACE_2 closed.

Step 7

Check that the standby device is running the new software version.

ACE_2/Admin# sh ver
Cisco Application Control Software (ACSW)
TAC support: http://www.cisco.com/tac
Copyright (c) 2002-2009, Cisco Systems, Inc. All rights reserved.
The copyrights to certain works contained herein are owned by
other third parties and are used and distributed under license.
Some parts of this software are covered under the GNU Public
License. A copy of the license is available at
http://www.gnu.org/licenses/gpl.html.

Software
  loader:    Version 12.2[120]
  system:    Version A2(2.3) [build 3.0(0)A2(2.3)]
  system image file: [LCP] disk0:c6ace-t1k9-mz.A2_2_3.bin
  installed license: ACE-VIRT-020

Hardware
  Cisco ACE (slot: 6)
  cpu info:
    number of cpu(s): 2
    cpu type: SiByte
    cpu: 0, model: SiByte SB1 V0.2, speed: 700 MHz
    cpu: 1, model: SiByte SB1 V0.2, speed: 700 MHz
  memory info:
    total: 827128 kB, free: 256000 kB
    shared: 0 kB, buffers: 1824 kB, cached 0 kB
  cf info:
    filesystem: /dev/cf
    total: 1014624 kB, used: 451040 kB, available: 563584 kB

last boot reason:  reload command by Admin
configuration register:  0x1
ACE_2 kernel uptime is 0 days 0 hour 8 minute(s) 45 second(s)
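
The version check can be automated by comparing the system: line of sh ver with the version encoded in the image file name. The sketch below assumes the naming pattern seen in this post, where c6ace-t1k9-mz.A2_2_3.bin corresponds to Version A2(2.3); other releases may not follow it exactly, and the function names are mine.

```python
import re


def version_from_filename(filename):
    """Turn '...A2_2_3.bin' into 'A2(2.3)'; None if the name does not fit.

    Assumption: the name ends with .<train>_<major>_<minor>.bin.
    """
    m = re.search(r"\.(A\d+)_(\d+)_(\w+)\.bin$", filename)
    if not m:
        return None
    return f"{m.group(1)}({m.group(2)}.{m.group(3)})"


def running_version(show_version_text):
    """Extract the version from the 'system:    Version ...' line."""
    m = re.search(r"^\s*system:\s+Version\s+(\S+)", show_version_text, re.M)
    return m.group(1) if m else None


def runs_image(show_version_text, filename):
    """True if the running version matches the one implied by the file name."""
    expected = version_from_filename(filename)
    return expected is not None and running_version(show_version_text) == expected
```

Fed with the sh ver output above and the file name c6ace-t1k9-mz.A2_2_3.bin, runs_image returns true.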

Step 8

Wait until all the contexts on the standby device stabilize in STANDBY_WARM or STANDBY_HOT state.

ACE_2/Admin# sh ft group brief

FT Group ID: 1  My State:FSM_FT_STATE_STANDBY_WARM
                Peer State:FSM_FT_STATE_ACTIVE
                Context Name: Admin     Context Id: 0
FT Group ID: 2  My State:FSM_FT_STATE_STANDBY_WARM
                Peer State:FSM_FT_STATE_ACTIVE
                Context Name: C1        Context Id: 4
FT Group ID: 3  My State:FSM_FT_STATE_STANDBY_WARM
                Peer State:FSM_FT_STATE_ACTIVE
                Context Name: C2        Context Id: 3

For your information, here is what Cisco says about the STANDBY_WARM state:

In the STANDBY_WARM state, as with the STANDBY_HOT state, configuration mode is disabled on the standby ACE and configuration and state synchronization continues. A failover from the active to the standby based on priorities and preempt can still occur while the standby is in the STANDBY_WARM state. However, while stateful failover is possible for a WARM standby, it is not guaranteed. In general, modules should be allowed to remain in this state only for a short period.
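
Rather than retyping sh ft group brief until the states settle, the wait can be written as a polling loop. This is only a sketch: fetch is a placeholder for whatever function returns the current command output from the standby ACE (that part is not shown and is an assumption), and the timeout and interval values are arbitrary.

```python
import re
import time

STATE_RE = re.compile(r"My State:\s*FSM_FT_STATE_(\w+)")
STABLE = {"STANDBY_WARM", "STANDBY_HOT"}


def my_states(brief_text):
    """List the local state of every FT group in 'sh ft group brief' output."""
    return STATE_RE.findall(brief_text)


def wait_until_stable(fetch, timeout=600.0, interval=10.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll fetch() until every group is STANDBY_WARM or STANDBY_HOT.

    Returns True once all groups are stable, False if the timeout expires.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        states = my_states(fetch())
        if states and all(s in STABLE for s in states):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The same loop works for Step 12 if you restrict STABLE to STANDBY_HOT only.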

Step 9

Perform a failover from the active ACE to the standby ACE for all the contexts.

ACE_1/Admin# ft switchover all
This command will cause card to switchover (yes/no)?  [no] yes

NOTE: Configuration mode has been disabled on all sessions

Step 10

Check that the newly upgraded ACE has become active.

ACE_1/Admin# sh ft group brief

FT Group ID: 1  My State:FSM_FT_STATE_STANDBY_BULK
                Peer State:FSM_FT_STATE_ACTIVE
                Context Name: Admin     Context Id: 0
FT Group ID: 2  My State:FSM_FT_STATE_STANDBY_BULK
                Peer State:FSM_FT_STATE_ACTIVE
                Context Name: C1        Context Id: 4
FT Group ID: 3  My State:FSM_FT_STATE_STANDBY_BULK
                Peer State:FSM_FT_STATE_ACTIVE
                Context Name: C2        Context Id: 3

Step 11

Reload the second ACE (the previously active one).

ACE_1/Admin# reload
This command will reboot the system
Save configurations for all the contexts. Save? [yes/no]: [yes] no
Perform system reload. [yes/no]: [yes]

NOTE: Configuration mode is enabled on all sessions

Connection to ACE_1 closed by remote host.
Connection to ACE_1 closed.

Step 12

When the second ACE stabilizes in FSM_FT_STATE_STANDBY_HOT state, perform a failover again for all the contexts, as in Step 9.

ACE_2/Admin# sh ft group brief

FT Group ID: 1  My State:FSM_FT_STATE_ACTIVE
                Peer State:FSM_FT_STATE_STANDBY_HOT
                Context Name: Admin     Context Id: 0
FT Group ID: 2  My State:FSM_FT_STATE_ACTIVE
                Peer State:FSM_FT_STATE_STANDBY_HOT
                Context Name: C1        Context Id: 4
FT Group ID: 3  My State:FSM_FT_STATE_ACTIVE
                Peer State:FSM_FT_STATE_STANDBY_HOT
                Context Name: C2        Context Id: 3

Step 13 (If you’re not superstitious)

Reconfigure preemption if it is part of your standard configuration… (personally I don’t like preemption, because if a device has failed I prefer to check exactly why before activating it again)

ACE_1/Admin(config)# ft group 1
ACE_1/Admin(config-ft-group)# preempt
ACE_1/Admin(config-ft-group)# ft group 2
ACE_1/Admin(config-ft-group)# preempt
ACE_1/Admin(config-ft-group)# ft group 3
ACE_1/Admin(config-ft-group)# preempt
ACE_1/Admin(config-ft-group)# end
ACE_1/Admin# wr mem

And that’s it, you have upgraded your ACE cluster with no or limited impact. If you find this post helpful you may leave a comment to encourage me to publish more articles…

Christophe Lemaire

Christophe has been a network and security engineer for more than 20 years. He has always been eager to learn new technologies and to share them with his peers. He's always happy to help, so don't hesitate...
