Monthly Archives: December 2013

How to “kick” an Avaya Local Survivable Processor without rebooting

I recently had an Avaya Local Survivable Processor (LSP) unregister. There was no indication why – it just went down. I was able to get to its web interface, and it had a minor alarm indicating that it was unregistered. I suppose I could have restarted the whole thing, but my tech talked me through this procedure, which worked great.

Please note that the LSP is typically registered and inactive. This LSP only activates if network connectivity is lost to the site. So if the home site is available through the WAN, it should look like this:

list survivable-processor
                             SURVIVABLE PROCESSORS
 Record Name/            Type         Reg Act         Translations      Net
 Number  IP Address                                   Updated           Rgn
  1     THE_LSP          LSP           y   n          2:02 12/30/2013   10
         10.2.10.110
        No V6 Entry

However, in my case it looked like this!

list survivable-processor
                             SURVIVABLE PROCESSORS
 Record Name/            Type         Reg Act         Translations      Net
 Number  IP Address                                   Updated           Rgn
  1     THE_LSP          LSP           n                                10
         10.2.10.110
        No V6 Entry

There was no entry for the active status or for the last translation date. I was able to SSH to the LSP and confirmed that it wasn’t active. The media gateway was still connected to my home site, and the LSP could ping the home site. Somehow, the LSP was simply lost.
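If you want to catch this condition without eyeballing the screen, the two outputs above can be compared by a script. This is only a rough sketch in Python, assuming you have already captured the text of "list survivable-processor" somehow (how you capture it is up to you). The function name and the regular expression are my own illustration, not anything Avaya provides, and the pattern assumes the record name contains no spaces.

```python
import re

# One record line: number, name, type, Reg flag, then (only when registered)
# the Act flag and the translation timestamp, then the network region.
# Assumption: the layout matches the screens shown above.
RECORD = re.compile(
    r"^\s*(?P<num>\d+)\s+(?P<name>\S+)\s+(?P<type>\S+)\s+(?P<reg>[yn])"
    r"(?:\s+(?P<act>[yn])\s+(?P<updated>\d{1,2}:\d{2}\s+\d{1,2}/\d{1,2}/\d{4}))?"
    r"\s+(?P<region>\d+)\s*$"
)

def parse_survivable_processors(text):
    """Return one dict per record line found in the captured output."""
    records = []
    for line in text.splitlines():
        match = RECORD.match(line)
        if match:
            records.append(match.groupdict())
    return records

def unregistered(records):
    """Names of the records whose Reg column is not 'y'."""
    return [r["name"] for r in records if r["reg"] != "y"]

# The two record lines from the screens above.
HEALTHY = "  1     THE_LSP          LSP           y   n          2:02 12/30/2013   10"
BROKEN = "  1     THE_LSP          LSP           n                                10"

print(unregistered(parse_survivable_processors(HEALTHY)))
print(unregistered(parse_survivable_processors(BROKEN)))
```

The first call reports nothing, and the second reports THE_LSP. The Act and Translations Updated fields come back as None for the unregistered record, which matches the blank columns on the screen.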

So I confirmed all CM processes were up with the “statapp” command:

telecom@pbx-cm01> statapp
Watchdog         9/ 9 UP SIMPLEX
TraceLogger      3/ 3 UP SIMPLEX
LicenseServer    2/ 2 UP SIMPLEX
SME              6/ 6 UP SIMPLEX
MasterAgent      1/ 1 UP SIMPLEX
MIB2Agent        1/ 1 UP SIMPLEX
MVSubAgent       1/ 1 UP SIMPLEX
LoadAgent        1/ 1 UP SIMPLEX
FPAgent          1/ 1 UP SIMPLEX
INADSAlarmAgent  1/ 1 UP SIMPLEX
GMM              4/ 4 UP SIMPLEX
SNMPManager      1/ 1 UP SIMPLEX
filesyncd        1/ 1 UP SIMPLEX
MCD              1/ 1 UP SIMPLEX
CommunicaMgr    86/86 UP SIMPLEX
telecom@pbx-cm01>
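With fifteen lines to read, it is easy to miss one that is not fully up. Here is a small Python sketch that checks a captured copy of the statapp output above. It assumes each process line has the shape "name running/expected state mode" as shown, and it treats any state other than UP as a problem. The function names are hypothetical, and the "3/ 4" line in the example is made up to show a failure.

```python
import re

# Assumed line shape, taken from the output above: "GMM  4/ 4 UP SIMPLEX".
STATAPP_LINE = re.compile(
    r"^(?P<name>\S+)\s+(?P<running>\d+)\s*/\s*(?P<expected>\d+)\s+(?P<state>\S+)"
)

def parse_statapp(text):
    """Return (name, running, expected, state) for each process line."""
    rows = []
    for line in text.splitlines():
        match = STATAPP_LINE.match(line)
        if match:
            rows.append((
                match["name"],
                int(match["running"]),
                int(match["expected"]),
                match["state"],
            ))
    return rows

def unhealthy(rows):
    """Rows where the counts differ or the state is anything but UP."""
    return [row for row in rows if row[1] != row[2] or row[3] != "UP"]

# A few lines copied from the output above, prompt included.
SAMPLE = """telecom@pbx-cm01> statapp
Watchdog         9/ 9 UP SIMPLEX
GMM              4/ 4 UP SIMPLEX
CommunicaMgr    86/86 UP SIMPLEX
telecom@pbx-cm01>"""

# Hypothetical line, invented for illustration only.
MADE_UP_FAILURE = "GMM              3/ 4 UP SIMPLEX"

print(unhealthy(parse_statapp(SAMPLE)))
print(unhealthy(parse_statapp(MADE_UP_FAILURE)))
```

The prompt lines are skipped because they do not contain a running/expected pair.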

Sure enough, all processes were running fine. So I stopped CM with this command:

stop -afcn

This stops the processes listed above, refreshing the screen as it goes. It’s pretty nice. When all processes have stopped, you can start CM again with

start -ac

Again, it refreshes the screen as the processes start back up. When it’s done, the LSP should re-register. It’s also a good idea to watch the home CM for this register event; you should see it about 30 seconds after CM comes back on the LSP. On the home CM, run

list trace ras ip-address 10.2.10.110

while the LSP processes restart.

Here is the register event:

list trace ras ip-address 10.2.10.110

                                LIST TRACE

time            data

17:45:39 TRACE STARTED 12/30/2013 CM Release String cold-03.0.124.0-21166
17:46:37   rcv RRQ ext
                   endpt  [10.2.10.110]:1719
                   switch [10.1.10.110]:1719
17:46:52   rcv KARRQ ext
                   endpt  [10.2.10.110]:1719
                   switch [10.1.10.110]:1719
17:47:07   rcv KARRQ ext
                   endpt  [10.2.10.110]:1719
                   switch [10.1.10.110]:1719
17:47:20   RAS TRACE COMPLETE ext

Those KARRQ messages are the keepalives.
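To see the keepalive rhythm at a glance, you can pull the timestamps out of the trace. This is just a sketch in Python that works on a pasted copy of the trace above. It assumes every received message line looks like "HH:MM:SS   rcv NAME ext", and it does not handle a trace that runs past midnight. The function names are my own.

```python
import re
from datetime import datetime

# Assumed line shape, taken from the trace above: "17:46:37   rcv RRQ ext".
EVENT = re.compile(r"^(?P<time>\d{2}:\d{2}:\d{2})\s+rcv\s+(?P<msg>\S+)")

def ras_events(text):
    """Return (time, message) for each received RAS message in the trace."""
    events = []
    for line in text.splitlines():
        match = EVENT.match(line.strip())
        if match:
            when = datetime.strptime(match["time"], "%H:%M:%S")
            events.append((when, match["msg"]))
    return events

def gaps_in_seconds(events):
    """Seconds between each event and the one after it (same day only)."""
    return [
        int((later[0] - earlier[0]).total_seconds())
        for earlier, later in zip(events, events[1:])
    ]

# Message lines copied from the trace above (endpoint lines trimmed).
TRACE = """17:45:39 TRACE STARTED 12/30/2013 CM Release String cold-03.0.124.0-21166
17:46:37   rcv RRQ ext
17:46:52   rcv KARRQ ext
17:47:07   rcv KARRQ ext
17:47:20   RAS TRACE COMPLETE ext"""

events = ras_events(TRACE)
print([msg for _, msg in events])   # ['RRQ', 'KARRQ', 'KARRQ']
print(gaps_in_seconds(events))      # [15, 15]
```

In this trace the first KARRQ arrives 15 seconds after the RRQ, and the next one 15 seconds after that.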

After this restart, the LSP registered fine. There was no service impact at the site, since the LSP stays inactive unless the WAN goes down. And it was much less scary (for me anyway) than restarting the whole LSP.

Happy New Year everyone!