Greater stability on the horizon?? Maybe…

As some people have noticed, there have been a few “improvements” going on with the sipsorcery service in the last few days. The main thrust of the improvements has been to move closer to a solution for the “long running dialplan/memory leak” issue that has plagued the mysipswitch and, more recently, sipsorcery services. The cause (and consequences) of the problem are discussed widely on the Forums site and also briefly in this Blog Post.

At this point I’m hopeful that the latest changes will finally solve the issue; however, I’ll wait for at least a week of stable behaviour before jumping to any substantial conclusions. Very briefly, the new approach has been to give up on attempting to isolate the cause of the memory leak, which lies somewhere in the interaction between sipsorcery, the DLR and IronRuby, and instead accept the leak but isolate it in a separate process and recycle that process once it hits a certain memory utilisation.

In theory the idea doesn’t sound overly complex, but it meant another round of pulling apart functions that were used to relying on all being within the same process. I’m getting fairly used to it at this stage though. The original mysipswitch service was all wrapped up nice and tightly in a single process. It was when the memory leak first cropped up that the extraction of different mysipswitch functions into different processes started, and the single process application has now evolved into the system shown below.

SIPSorcery Deployment Diagram


Larger deployment diagram available here.

The trickiest thing ended up being processing calls forwarded to another sipsorcery user that need to use the called user’s dialplan, i.e. one call generating two or more dialplan instances. To be honest, coding this up seriously hurt my brain. It starts off ok: one call arrives and drops into the dialplan, that dialplan calls a second sipsorcery user who has specified that incoming calls go via one of their dialplans, and the second dialplan calls out to a 3rd party SIP Provider. So far so good. But in order to generate the correct call detail records (CDRs), so that both the caller and called users have an accurate record, the call between the two dialplans has to generate an extra two CDRs, and doing that means creating two additional SIP transactions. So that’s now 2 dialplan instances, 4 SIP calls and 4 SIP transactions. But then instead of calling an external provider the second dialplan could call another sipsorcery user who also uses an incoming dialplan. AHHHH -> Brain Pain. The flowchart of the whole thing gives a bit of an idea of the complexity.

SIPSorcery State Diagram - Dialplan-to-Dialplan Call Processing


Larger state diagram available here.
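The counting in the dialplan-to-dialplan case can be written down as simple arithmetic. The sketch below is only an illustration in Python, not sipsorcery code: the function name is hypothetical, and it assumes that every additional dialplan-to-dialplan hop costs the same two extra SIP calls and two extra SIP transactions that the post describes for the two-dialplan case. Longer chains are an extrapolation from that.

```python
def chain_totals(dialplans: int) -> dict:
    """Count SIP calls and transactions for a chain of dialplan instances
    that ends at an external provider.

    Assumption (extrapolated from the two-dialplan case in the post): each
    dialplan-to-dialplan hop adds two SIP calls, one CDR apiece, and two
    SIP transactions.
    """
    if dialplans < 1:
        raise ValueError("at least one dialplan instance is needed")
    hops = dialplans - 1          # links between consecutive dialplans
    calls = 2 + 2 * hops          # inbound call + outbound call + 2 per hop
    return {
        "dialplan_instances": dialplans,
        "sip_calls": calls,
        "sip_transactions": calls,
    }


if __name__ == "__main__":
    # dialplans=2 reproduces the figures quoted in the post:
    # 2 dialplan instances, 4 SIP calls, 4 SIP transactions.
    for n in (1, 2, 3):
        print(chain_totals(n))
```

If that assumption holds, a third user with an incoming dialplan would push the totals to 3 dialplan instances, 6 SIP calls and 6 SIP transactions.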

So the upshot of all that is that the new isolated process mechanism is now in place and, if all goes according to plan, there will be no more outages caused by the dialplan processing memory leak (that’s a “hopefully” no more outages, not a guarantee, and only for the memory leak issue). At the moment the dialplan worker process is set to recycle once it hits a memory working set of 150MB, at which point the secondary worker process will take over until the primary one has completed the recycle. So far it’s working very well. There were a couple of minor hiccups today when the update went in. One caused the “Dial plan script engine was overloaded” error message but that was quickly resolved.
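The recycle-and-take-over behaviour described above can be modelled roughly as follows. This is a simplified Python sketch and not the actual sipsorcery implementation: the class and method names are hypothetical, memory readings are plain numbers rather than values read from the operating system, and the recycle is represented by a flag instead of a real process restart. Only the 150MB threshold and the primary/secondary roles come from the post.

```python
from dataclasses import dataclass, field

# 150MB working set threshold, as quoted in the post.
RECYCLE_THRESHOLD_BYTES = 150 * 1024 * 1024


@dataclass
class Worker:
    name: str
    working_set: int = 0      # bytes; a real monitor would ask the OS
    recycling: bool = False   # True while the process is being restarted


@dataclass
class DialPlanWorkerPool:
    threshold: int = RECYCLE_THRESHOLD_BYTES
    primary: Worker = field(default_factory=lambda: Worker("primary"))
    secondary: Worker = field(default_factory=lambda: Worker("secondary"))

    def monitor(self) -> None:
        """Periodic check: start a recycle once the primary hits the threshold."""
        if not self.primary.recycling and self.primary.working_set >= self.threshold:
            self.primary.recycling = True

    def recycle_complete(self) -> None:
        """The restarted primary comes back with the leaked memory gone."""
        self.primary.working_set = 0
        self.primary.recycling = False

    def route(self) -> Worker:
        """Pick the worker that should run the next dialplan."""
        return self.secondary if self.primary.recycling else self.primary


if __name__ == "__main__":
    pool = DialPlanWorkerPool()
    pool.primary.working_set = 160 * 1024 * 1024   # simulate the leak
    pool.monitor()
    print(pool.route().name)    # secondary handles calls during the recycle
    pool.recycle_complete()
    print(pool.route().name)    # primary takes over again
```

The sketch leaves out anything the post does not describe, such as what happens if the secondary process also grows past the threshold while it is covering for the primary.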

Apart from that a few other minor changes and miscellaneous points that have sprung to mind are:

  • Duplicate call destinations are now being ignored in dial strings. For some reason some users were blasting a single provider with the same call with 10 different usernames. That’s not very friendly behaviour and is now being prevented.
  • I’m working on a blog post and some diagrams to explain how the audio (RTP) streams get set up when using the sipsorcery service. There are a few people mentioning one way audio issues on the forums, which is not a new thing, and there have been some changes from mysipswitch to sipsorcery that were intended to improve the RTP set up that warrant mentioning. Suffice to say the whole audio set up process relies on two IP addresses that are contained in the body of the INVITE request and response. The only thing the sipsorcery software ever attempts to do is mangle the IP address in the body if it is private. I’ll explain more about this in a subsequent post.
  • There have been a few requests for access to CDRs from the sipsorcery system. There is a web service interface for .Net WCF clients available at https://www.sipsorcery.com/provisioning.svc. I believe other SOAP clients should also be able to access it using the WSDL from https://www.sipsorcery.com/provisioning.svc?wsdl but I haven’t verified that. You need to be able to program against web services in order to use it. Down the track I would hope to expose a more user friendly way of getting at this sort of data but don’t know when that will be. The provisioning service actually exposes all the sipsorcery functions and is what the Silverlight client uses to communicate with the backend. For those people that didn’t like the Silverlight interface, it’s feasible that a different AJAX/HTML based UI could be built which calls that service.
  • On a less technical note, the first month or so of the sipsorcery service has been a pretty hectic one. Notwithstanding the prominent issues such as the database crash and the memory leak, it’s been a bit of a running battle keeping up with sipsorcery users to keep the service functioning smoothly. A big difference between mysipswitch and sipsorcery is that I tried to anticipate and avoid more of the minor issues that impact the service performance. As just one example, people put invalid entries in as a registration contact for a provider. That can result in the sipsorcery registration agent wasting time doing DNS lookups or trying to contact a non-existent IP address. I stopped all the obvious things: a user registering a contact back to sipsorcery, which is completely pointless; making sure a contact host is an IP address or hostname; making sure that the contact has a user portion and a host portion, etc. Despite all that, registration contacts like sip:me@U.S.A still appeared, so more rules need to get added and bad registration contacts disabled. It does take a bit of admin time keeping the sipsorcery service running over and above trying to sort the software out.
  • Phew! That will do for now.
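As a rough illustration of the registration contact checks mentioned in the list above, here is a minimal Python sketch. It is not the sipsorcery code: the function name, the `SERVICE_HOSTS` set and the rejection messages are all hypothetical, and the URI parsing is deliberately simplified (IPv4 addresses and hostnames only, URI parameters ignored, port not validated).

```python
import ipaddress
import re
from typing import Optional

# Hypothetical list of hosts that belong to the service itself.
SERVICE_HOSTS = {"sipsorcery.com", "www.sipsorcery.com"}

# One DNS label: letters, digits and hyphens, no leading or trailing hyphen.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def _is_ip(host: str) -> bool:
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def _is_hostname(host: str) -> bool:
    return len(host) <= 253 and all(_LABEL.match(label) for label in host.split("."))


def check_contact(uri: str) -> Optional[str]:
    """Return a rejection reason, or None if the contact passes the basic rules."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() not in ("sip", "sips"):
        return "not a SIP URI"
    rest = rest.split(";", 1)[0]                 # drop any URI parameters
    user, at, hostport = rest.partition("@")
    if not at or not user or not hostport:
        return "missing user or host portion"
    host = hostport.partition(":")[0]            # strip an optional port
    if host.lower() in SERVICE_HOSTS:
        return "contact points back at the service"
    if not (_is_ip(host) or _is_hostname(host)):
        return "host is neither an IP address nor a hostname"
    return None


if __name__ == "__main__":
    for contact in ("sip:user@sipsorcery.com", "sip:example.com",
                    "sip:me@192.0.2.10:5060", "sip:me@U.S.A"):
        print(contact, "->", check_contact(contact) or "accepted")
```

Note that sip:me@U.S.A is accepted by these rules, because U.S.A is a syntactically valid hostname. That is consistent with the point made above that purely syntactic checks are not enough and more rules are needed.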

    Aaron

    1. Huib’s avatar

      Hope the brain pain is subsiding a bit by now. Thanks for continuing in making sipsorcery a great service!


    2. jvwelzen’s avatar

      Hi Aaron

      Thanks for the explanation

      I can’t wait for the blog post about Audio RTP streams

      Will you also explain why audio streams do not work with an incoming dialplan?

      This helps a lot. Thanks


    3. jvwelzen’s avatar

      Hi Aaron

      Do you have an indication of when you will post a new local version?

      I think I have everything working how I like it with the online version

      But with the local version the same setup doesn’t seem to work yet

      I really want to test the local version with the same setup

      Thanks in advance


    4. Tuketu’s avatar

      Aaron:
      Thanks a ton for the hard work and thanks again for the blog documentation and explanations. I’m also looking forward to the blog post about the RTP streams.

      I’m trying to get my head around your Deployment Diagram…any clues about the significance of the different colors in the arcs?


      1. sipsorcery’s avatar

        Sure (I didn’t think anyone would be interested in the diagram details and posted them to give a broad idea rather than details):

        – Each outer black circle represents a process,
        – Each inner coloured circle represents a thread (the purple ones are threads performing SIP stack tasks, the blue ones anything else),
        – The arrows represent traffic flows or inter-process communications:
        – Purple represents SIP traffic,
        – Green represents database connections,
        – Yellow represents a SOAP web service call,
        – Blue represents a trace message (the messages that show up in the telnet and Silverlight monitoring consoles),
        – Orange is for the NAT keep-alive requests from the Registrar to the Proxy.


      2. Kenzo’s avatar

        Well, great job again!
        After all, it was good having to reconfigure everything, since I cleaned up and improved my Dialplans a lot… hehehe…
        And I’m also waiting for the RTP post, that should be good!

        And some help with the WebService would be good too!
