Category Archives: pbx

Avaya 9630 Locks / Reboots up when registering

I recently had an Avaya phone in a reboot cycle. It would boot up, then when it registered it would lock up and after a couple minutes reboot again. The display looked like it was up and running fine, but when you press the SPEAKER button I would get three beeps. This is typically what happens when a phone cannot get TCP/IP signalling traffic to the call server. And the night before we had some maintenance on that Ethernet switch so I immediately suspected a network problem.

Just in case, I did a “CLEAR” procedure on the phone. Then I swapped out the phone. Then I swapped out the patch cables (at both ends). Then I moved the phone to a different port in the Ethernet switch. No matter what I did, the phone locked up. Then I tried something I probably should have tried earlier – I logged in a different extension and it worked fine! Then I logged the “bad” extension into a different phone and it locked up!

Turns out the config file on the web server (1234_96xxdata.txt) was incomplete. Apparently it was related to the network after all! When the phone was writing its data file to the web server the previous night, the write operation was interrupted as the network guy shut off the Ethernet switch. The resulting data file was incomplete – it had about half the call log entries and a partial line at the end. But none of the important lines you’d expect in the file such as:

LOGTDFORMAT=128
Redial=1
Edit Dialing=1
Go to Phone Screen on Calling=0
Go to Phone Screen on Ringing=1
Call Timer=1
Visual Alerting=0
History Active=1
Log Bridged Calls=1
Audio Path=1
Personalized Ring=0
Handset AGC=1
Headset AGC=1
Speaker AGC=1
Error Tone=1
Button Clicks=0
Text Size=1
Contacts Pairing=0
Voice Initiated Dialing=1
Voice Dialing Help Counter=0
Personalized Ring Menu=0
Go to Phone Screen on Answer=0
Voice Initiated Dialing Language=

If the file were missing, the phone would use default values and create the file at the next backup. However, since the file was there, the phone processed it but ended up locking up because it was incomplete. In all my years working with these phones, I’ve never seen that before. I wouldn’t have thought it possible for the phone’s interrupted “HTTP PUT” operation to result in an incomplete file on the web server, but there you go. Hopefully this helps you.

I recently went through an ordeal with a PBX resetting. It’s an Avaya system using an IPSI to connect a port network back to its host, but this situation applies to anyone out there using QOS on their MPLS network. I’ve often said that being a “phone guy” is rarely about phones anymore. Most of my work – certainly troubleshooting – involves IP networking.

So I had a PBX with one IPSI that would occasionally reset. Since there was only one IPSI, the reset would cause all cards in the port network to reset as well, which would drop all calls in progress. Now this is about the worst thing that can happen when you’re responsible for the telephones. Full system outages are easier to understand. This is a reset, calls drop, users get frustrated and re-establish their calls, then it would reset again. It was a really bad situation.

What is causing the resets? Avaya said the heartbeats were failing to the IPSI. For any of you with an IPSI-connected port network, you should occasionally look for these. SSH to your Communication Manager and cd to /var/log/ecs. Then list the log files. Assuming you’re in Feburary 2013, you would look for missed heartbeats in your ecs log with the command:

cat 2013-02*.log|grep checkSlot
:pcd(5561):MED:[[3:0] checkSlot: sanity failure (1)]
:pcd(5561):MED:[[3:0] checkSlot: sanity failure (2)]
:pcd(5561):MED:[[3:0] checkSlot: sanity failure (3)]
:pcd(5561):MED:[[3:0] checkSlot: sanity failure (4)]
:pcd(5561):MED:[[3:0] checkSlot: data received replacing sanity message; socket delay is 14 secs]

I have stripped the date/time; you’ll see those on the left. Port networks and IPSIs are zero indexed, so the messages above apply to port network 4 and IPSI number 1.

I have been told that occasional sanity failures are just a part of life. These heartbeat messages are part of the Avaya protocol, not ICMP. So if you’re missing heartbeats, it’s not because ICMP is being dropped.

However, after a certain number of sanity failures, the IPSI will reset in order to re-esablish communication. How many sanity failures? That depends upon a system parameter setting:

display system-parameters ipserver-interface
IP SERVER INTERFACE (IPSI) SYSTEM PARAMETERS

SERVER INFORMATION
Primary Control Subnet Address:
Secondary Control Subnet Address:

OPTIONS

Switch Identifier: A
IPSI Control of Port Networks: enabled
A-side IPSI Preference: disabled
IPSI Socket Sanity Timeout: 15

QoS PARAMETERS
802.1p: 6
DiffServ: 46

The IPSI Socket Sanity Timeout determines how many sanity failures will cause an IPSI failover (if you have two in your port network), or a reset(!) if you only have one. The reset is the IPSI’s way of trying to re-establish communication. If you get too many sanity failures, you’ll get this message:

:pcd(5561):MED:[[3:0] checkSlot: too many sanity failures (15)]

Unfortunately, this means my CM lost connectivity to the first IPSI on port network 4. If I only have one IPSI, then the IPSI and all cards in the port network will reset. If I have a redundant IPSI, then the port network will failover and everything should be okay. In my particular case a second IPSI would not have helped me. It turns out, my MPLS carrier (who had also set up our edge routers) was policing the committed access rate. I’ll explain with more detail in my next post. The resolution was to shape the traffic rather than police it.

Recommendations to the new phone administrator

I’ve been working on telephone systems for a while. And I love my job. For the past few years, I find myself working with network administrators who have been handed the job of managing the telephone system. It makes sense – the PBX is just a big voice router, and nowadays the telephones are IP network endpoints.

But there’s more to managing a voice network than knowing the data network. I’m often asked by the new telecom admin “where should I start?” There’s a lot to know. And my biggest piece of advice is to Be the Authority. By this I mean you should be the person everyone asks about telephones. And you should usually start with the telephone and your voicemail system. The telephone is a complicated endpoint. Voicemail has a ton of features and an extremely limited user interface. For example, learn how to do the following:

  • Know how to transfer a call into voicemail without ringing the station.
  • Know how to conference two parties together. This includes two inbound calls. Also, learn the limits of conferencing. How many parties can conference together?
  • Can your users transfer calls outside the PBX (i.e. to mobile numbers)? If so, what happens if voicemail picks up at the far end. How do you pull that call back? What about when you attempt to conference rather than transfer?
  • Learn what all the feature buttons do, like park, call pickup, do-not-disturb, or any one of about 200 possible features.
  • Know how to program the speed dial buttons.
  • Keep a list of conference rooms and the speakerphone numbers handy.
  • Get to know you receptionists and find out what they need in a telephone system. They probably wish they had an accurate company directory, right? In a later post I’ll talk about how to provide this.
  • Spend time walking the floor and interacting with users. When someone calls for a simple change that can be performed remotely, go visit the user or at least give him or her a call. Try to chat about how they use the phone.
  • Learn how to create an out-of-office greeting and activate/deactivate it.
  • Learn how to leave a voicemail for someone without ringing their telephone.

The goal is to know the system well. You want people to think of you when they are trying to do something new. When you’re visiting, discreetly listen to the interaction with callers. I cannot tell you how many times I’ve heard “You’ll have to call back and ask the operator” or “His extension is 8244 but you’ll have to call back. I cannot transfer from here”. Try to help these people understand how to use the phone. Of course, some folks don’t want to hear it but some do. Be helpful. Know your telephone system. Be the Authority.

What types of questions do you get?

481 Call Does Not Exist (no local tag match)

Really? I’m the only person on the entire Internet to get this message from an Avaya Session Manager?

481 Call Does Not Exist (no local tag match)

I’m trying to integrate Avaya Aura Session Manager 6.1 with an Audiocodes MP-118. I have it working in one direction so far. If all goes as planned, I’ll get this figured out and will forget all about the cause for this error 481. Alas, I look forward to it already.

UPDATE: This turned out to be sort of an asymmetric route. In the chaos and confusion of the moment, I had Session Manager A sending calls to the Audiocodes, but the Audiocodes sending responses back to Session Manager B. Hopefully this helps someone out there.

Quick fix for Avaya MAS error Access is denied (0x80070005)

It’s not that I’m an “Avaya guy”, but it just happens to be the system I’ve been working with lately. If any of you have tried to publish a caller app on Modular Messaging and gotten the message Error in application deployment (Access is denied (0x8007005)), there’s an easy fix.

Error in application deployment

In the old version 3.1, you could deploy apps via RDP, but now in version 6.x, you’re only able to do it from a local terminal. Or, you can also RDP with the /admin switch:

mstsc /v:mymas /admin

You can then deploy apps remotely. Simple, but since I couldn’t find any quick info when I googled the error, I thought I’d post it here.