
I recently went through an ordeal with a PBX resetting. It’s an Avaya system using an IPSI to connect a port network back to its host, but this situation applies to anyone using QoS on an MPLS network. I’ve often said that being a “phone guy” is rarely about phones anymore. Most of my work – certainly troubleshooting – involves IP networking.

So I had a PBX with one IPSI that would occasionally reset. Since there was only one IPSI, the reset would cause all cards in the port network to reset as well, which would drop all calls in progress. This is about the worst thing that can happen when you’re responsible for the telephones. Full system outages are easier to understand. In this case the IPSI would reset, calls would drop, users would get frustrated and re-establish their calls, and then it would reset again. It was a really bad situation.

What was causing the resets? Avaya said the heartbeats to the IPSI were failing. If you have an IPSI-connected port network, you should occasionally look for missed heartbeats. SSH to your Communication Manager and cd to /var/log/ecs. Then list the log files. Assuming you’re in February 2013, you would look for missed heartbeats in your ecs log with the command:

cat 2013-02*.log|grep checkSlot
:pcd(5561):MED:[[3:0] checkSlot: sanity failure (1)]
:pcd(5561):MED:[[3:0] checkSlot: sanity failure (2)]
:pcd(5561):MED:[[3:0] checkSlot: sanity failure (3)]
:pcd(5561):MED:[[3:0] checkSlot: sanity failure (4)]
:pcd(5561):MED:[[3:0] checkSlot: data received replacing sanity message; socket delay is 14 secs]

I have stripped the date/time; you’ll see those on the left. Port networks and IPSIs are zero indexed, so the messages above apply to port network 4 and IPSI number 1.
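If you would rather tally these than eyeball them, a rough Python sketch might look like the following. The function names are made up for illustration, and the regular expressions assume the log lines look exactly like the samples above (with or without the date/time prefix). It converts the zero-indexed [port network:IPSI] pair to one-based numbers and records the highest failure count seen for each pair.

```python
import re

# Matches the checkSlot lines shown above; the numbers in [pn:ipsi] are zero-indexed.
CHECKSLOT = re.compile(r"\[\[(\d+):(\d+)\] checkSlot: (.*)\]")
FAILURE = re.compile(r"sanity failure \((\d+)\)")


def parse_checkslot(line):
    """Return (port_network, ipsi, message) using one-based numbers, or None."""
    m = CHECKSLOT.search(line)
    if m is None:
        return None
    return int(m.group(1)) + 1, int(m.group(2)) + 1, m.group(3)


def worst_failure_count(lines):
    """Map (port_network, ipsi) to the highest sanity failure count seen."""
    worst = {}
    for line in lines:
        parsed = parse_checkslot(line)
        if parsed is None:
            continue
        pn, ipsi, message = parsed
        f = FAILURE.search(message)
        if f:
            key = (pn, ipsi)
            worst[key] = max(worst.get(key, 0), int(f.group(1)))
    return worst


sample = [
    ":pcd(5561):MED:[[3:0] checkSlot: sanity failure (1)]",
    ":pcd(5561):MED:[[3:0] checkSlot: sanity failure (2)]",
    ":pcd(5561):MED:[[3:0] checkSlot: sanity failure (3)]",
    ":pcd(5561):MED:[[3:0] checkSlot: sanity failure (4)]",
    ":pcd(5561):MED:[[3:0] checkSlot: data received replacing sanity message; socket delay is 14 secs]",
]
print(worst_failure_count(sample))
```

For the sample above this reports a worst count of 4 for port network 4, IPSI 1. To run it against real logs, you could feed it the lines of the files you would otherwise cat.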

I have been told that occasional sanity failures are just a part of life. These heartbeat messages are part of the Avaya protocol, not ICMP. So if you’re missing heartbeats, it’s not because ICMP is being dropped.

However, after a certain number of sanity failures, the IPSI will reset in order to re-establish communication. How many sanity failures? That depends upon a system parameter setting:

display system-parameters ipserver-interface
IP SERVER INTERFACE (IPSI) SYSTEM PARAMETERS

SERVER INFORMATION
Primary Control Subnet Address:
Secondary Control Subnet Address:

OPTIONS

Switch Identifier: A
IPSI Control of Port Networks: enabled
A-side IPSI Preference: disabled
IPSI Socket Sanity Timeout: 15

QoS PARAMETERS
802.1p: 6
DiffServ: 46

The IPSI Socket Sanity Timeout determines how many sanity failures will cause an IPSI failover (if you have two in your port network), or a reset(!) if you only have one. The reset is the IPSI’s way of trying to re-establish communication. If you get too many sanity failures, you’ll get this message:

:pcd(5561):MED:[[3:0] checkSlot: too many sanity failures (15)]
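Since occasional failures are normal and only a full run up to the timeout causes a reset, it can help to sort each episode into "recovered", "near miss", or "reset". The sketch below is one way to do that in Python. It makes several assumptions: the timeout is 15 as in the form above, the log covers a single IPSI (it does not separate interleaved port networks), the 50% near-miss threshold is an arbitrary choice, and the sample lines are synthetic, modeled on the messages shown in this post.

```python
import re

SANITY_TIMEOUT = 15  # "IPSI Socket Sanity Timeout" from the form above

FAILURE = re.compile(r"checkSlot: sanity failure \((\d+)\)")
TOO_MANY = re.compile(r"checkSlot: too many sanity failures \((\d+)\)")
RECOVERED = re.compile(r"checkSlot: data received replacing sanity message")


def summarize(lines, timeout=SANITY_TIMEOUT, warn_fraction=0.5):
    """Count recoveries, near misses, and resets for a single IPSI's log lines."""
    summary = {"recovered": 0, "near_miss": 0, "reset": 0}
    run = 0  # highest failure count seen in the current episode
    for line in lines:
        m = FAILURE.search(line)
        if m:
            run = max(run, int(m.group(1)))
        elif TOO_MANY.search(line):
            summary["reset"] += 1
            run = 0
        elif RECOVERED.search(line):
            # The episode ended without a reset; flag it if it got close.
            if run >= timeout * warn_fraction:
                summary["near_miss"] += 1
            else:
                summary["recovered"] += 1
            run = 0
    return summary


def failure(n):
    """Build a synthetic sanity-failure line in the format shown above."""
    return f":pcd(5561):MED:[[3:0] checkSlot: sanity failure ({n})]"


RECOVERY = ":pcd(5561):MED:[[3:0] checkSlot: data received replacing sanity message; socket delay is 14 secs]"
RESET = ":pcd(5561):MED:[[3:0] checkSlot: too many sanity failures (15)]"

log = (
    [failure(n) for n in range(1, 5)] + [RECOVERY]      # short blip
    + [failure(n) for n in range(1, 10)] + [RECOVERY]   # got to 9 of 15
    + [failure(n) for n in range(1, 15)] + [RESET]      # hit the timeout
)
print(summarize(log))
```

A growing near-miss count is a hint that a reset is coming, even if none has happened yet.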

Unfortunately, this means my CM lost connectivity to the first IPSI on port network 4. If I only have one IPSI, then the IPSI and all cards in the port network will reset. If I have a redundant IPSI, then the port network will fail over and everything should be okay. In my particular case a second IPSI would not have helped me. It turned out that my MPLS carrier (who had also set up our edge routers) was policing the committed access rate. I’ll explain in more detail in my next post. The resolution was to shape the traffic rather than police it.
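To see why shaping and policing behave so differently, here is a toy token-bucket model in Python. It is only a sketch of the general technique, not the carrier's actual router configuration: the rate, burst size, packet sizes, and function names are all invented for illustration, the shaper's queue is unbounded, and every packet is assumed to be no larger than the burst size. A policer drops whatever exceeds the bucket, while a shaper holds the excess and sends it late.

```python
def police(arrivals, rate, burst):
    """Token-bucket policer: forward conforming packets, drop the rest.

    arrivals is a time-sorted list of (time_seconds, size_bytes).
    """
    tokens, last = burst, 0.0
    sent, dropped = [], []
    for t, size in arrivals:
        # Refill tokens for the time elapsed, capped at the burst size.
        tokens = min(burst, tokens + (t - last) * rate)
        last = t
        if size <= tokens:
            tokens -= size
            sent.append((t, size))
        else:
            dropped.append((t, size))
    return sent, dropped


def shape(arrivals, rate, burst):
    """Token-bucket shaper: delay packets until enough tokens have accrued.

    Returns a list of (departure_time_seconds, size_bytes).
    """
    tokens, last = burst, 0.0
    sent = []
    for t, size in arrivals:
        start = max(t, last)  # cannot leave before the packet ahead of it
        tokens = min(burst, tokens + (start - last) * rate)
        if size > tokens:
            # Wait just long enough to cover the token deficit.
            start += (size - tokens) / rate
            tokens = size
        tokens -= size
        last = start
        sent.append((start, size))
    return sent


# Five 1000-byte packets arrive at once on a 1000 bytes/sec contract
# with a 1500-byte burst allowance (all numbers are made up).
burst_of_packets = [(0.0, 1000)] * 5

forwarded, dropped = police(burst_of_packets, rate=1000, burst=1500)
print(len(forwarded), "forwarded,", len(dropped), "dropped by the policer")

departures = [t for t, _ in shape(burst_of_packets, rate=1000, burst=1500)]
print("shaper departure times:", departures)
```

With these numbers the policer forwards one packet and drops four, while the shaper delivers all five, spread over a few seconds. Delayed traffic is not free either, but it is a different failure mode from dropped traffic.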

Recommendations to the new phone administrator

I’ve been working on telephone systems for a while. And I love my job. For the past few years, I find myself working with network administrators who have been handed the job of managing the telephone system. It makes sense – the PBX is just a big voice router, and nowadays the telephones are IP network endpoints.

But there’s more to managing a voice network than knowing the data network. I’m often asked by the new telecom admin “where should I start?” There’s a lot to know. And my biggest piece of advice is to Be the Authority. By this I mean you should be the person everyone asks about telephones. And you should usually start with the telephone and your voicemail system. The telephone is a complicated endpoint. Voicemail has a ton of features and an extremely limited user interface. For example, learn how to do the following:

  • Know how to transfer a call into voicemail without ringing the station.
  • Know how to conference two parties together. This includes two inbound calls. Also, learn the limits of conferencing. How many parties can conference together?
  • Can your users transfer calls outside the PBX (i.e. to mobile numbers)? If so, what happens if voicemail picks up at the far end? How do you pull that call back? What about when you attempt to conference rather than transfer?
  • Learn what all the feature buttons do, like park, call pickup, do-not-disturb, or any one of about 200 possible features.
  • Know how to program the speed dial buttons.
  • Keep a list of conference rooms and the speakerphone numbers handy.
  • Get to know your receptionists and find out what they need in a telephone system. They probably wish they had an accurate company directory, right? In a later post I’ll talk about how to provide this.
  • Spend time walking the floor and interacting with users. When someone calls for a simple change that can be performed remotely, go visit the user or at least give him or her a call. Try to chat about how they use the phone.
  • Learn how to create an out-of-office greeting and activate/deactivate it.
  • Learn how to leave a voicemail for someone without ringing their telephone.

The goal is to know the system well. You want people to think of you when they are trying to do something new. When you’re visiting, discreetly listen to the interaction with callers. I cannot tell you how many times I’ve heard “You’ll have to call back and ask the operator” or “His extension is 8244 but you’ll have to call back. I cannot transfer from here”. Try to help these people understand how to use the phone. Of course, some folks don’t want to hear it but some do. Be helpful. Know your telephone system. Be the Authority.

What types of questions do you get?

Quick fix for Avaya MAS error Access is denied (0x80070005)

It’s not that I’m an “Avaya guy”, but it just happens to be the system I’ve been working with lately. If any of you have tried to publish a caller app on Modular Messaging and gotten the message Error in application deployment (Access is denied (0x80070005)), there’s an easy fix.

Error in application deployment

In the old version 3.1, you could deploy apps via RDP, but in version 6.x you can only do it from a local terminal. Alternatively, you can RDP with the /admin switch:

mstsc /v:mymas /admin

You can then deploy apps remotely. Simple, but since I couldn’t find any quick info when I googled the error, I thought I’d post it here.