Philosophy


One of the questions the sipsorcery project has been aimed at answering is whether it’s possible to run a telco in the cloud. By cloud I mean a public platform offering on demand resources at an operating system or database level. The sipsorcery service moved from a single server hosted in private infrastructure to a single server in the Amazon EC2 cloud in July 2009 from which point on the sipsorcery service could be said to have become a cloud hosted telecoms service. For a service like sipsorcery the advantages of operating in the cloud were envisaged primarily as cost, scalability and reliability.

Within about 6 weeks of migrating to the EC2 cloud, and after sorting out a few software issues within the sipsorcery code, the Amazon EC2 failures started and have continued ever since. The symptom was that the underlying Windows virtual server hosting the sipsorcery software simply dropped off the network and stopped responding to any type of network traffic: pings, RDP, HTTP, SIP, everything. Initially I suspected a problem in the sipsorcery software, but after a lot of effort over 3 or 4 months, and from keeping an eye on the EC2 forums for people having similar experiences, I became suspicious that it was more than likely an issue with Amazon’s infrastructure at the network, firewall or host level. I’m now 99% sure the problem is at the host level and that the network driver in the Xen virtualisation software is failing under certain conditions. The strongest evidence for this is that when a sipsorcery instance drops off the network it will miraculously start working again around 3 hours later, presumably after the underlying Xen software clears out its network connections cache or something.

Since 2006 I have invested a lot of time and effort in working with Amazon’s EC2 and S3 products, and when the sipsorcery EC2 failures started causing a lot of pain I rationalised that an alternative cloud service provider would most likely have their own issues and it was better to stay with the devil you know. To that end I signed up to EC2’s premium support (at USD$100 per month) so I could log a ticket to resolve the issue. Unfortunately that proved fruitless. The advice I received was to try the latest Windows image provided by Amazon and see if that helped; reasonable advice, but the updated images exhibited exactly the same issue. I cancelled the premium support and resorted to vociferous whinging on the EC2 forum. After a month and a half of that approach I eventually got an email from an Amazon engineer who did look into it for me and, while not so much acknowledging that the issue was Xen related, suggested I try a Windows image backed by Elastic Block Store (EBS) storage because it had some additional updated Xen drivers. So with high hopes I toddled off again and built up the sipsorcery image, which I had down to a fine art by now. It took a few weeks for the EBS backed instance to fail and I’d actually started to believe it had solved the problem, but eventually it did fail and the Amazon engineer didn’t have any further ideas.

So back to the 3 advantages of operating in the cloud. The reliability of the service certainly diminished a lot after moving from a dedicated server, but what about cost and scalability? To overcome the repeated failures of the sipsorcery EC2 instance a second failover instance was utilised, meaning there were now two sipsorcery EC2 instances required. In addition the size of both instances needed to be upgraded from small to medium to cope with the increase in sipsorcery users. And often while troubleshooting the failures it was necessary to leave one or two failed instances running in case Amazon ever wanted to check them. Suffice to say the average monthly cost for the sipsorcery EC2 servers is 5 or 6 times what had been originally forecast and is 2 or 3 times more than a dedicated server with equivalent specifications.

That leaves scalability. The first scalability problem for sipsorcery was the database. In the initial single EC2 instance deployment a MySQL database sitting on the same instance was used. The very first instance failure, combined with a misconfiguration of the MySQL database by me, resulted in the very first sipsorcery users losing their data. To scale the database a better deployment model was needed. About that time two new products came out: one was Amazon’s Relational Database Service (RDS), the other was Microsoft’s SQL Azure. Amazon’s RDS is based on MySQL and since it’s probably in the same data center as the EC2 infrastructure my preference was to use it. However it didn’t take long to realise it was a poor solution. There was no replication and no clustering; the product was simply a single MySQL server sitting on its own instance, which was not much different from what sipsorcery already had. No thanks.

Microsoft’s SQL Azure was unsurprisingly based on Microsoft’s SQL Server and was about 100ms of network away from the sipsorcery EC2 servers, but it was still compelling because it claimed to solve two of the most difficult database problems: handling data replication for fault tolerance, and having a deployment model that allowed it to scale demand across the SQL Azure cloud. But it was a new product, and if the EC2 experience was anything to go by, was it worth the risk? In December of 2009 I did move the sipsorcery data to SQL Azure and, while there have been a few issues that have caused outages to the sipsorcery service, in general it has worked extremely well. Even more importantly, on one of the issues that I experienced and posted about on the SQL Azure forums I got an email from the Microsoft SQL Azure product manager, who got his engineers involved to identify the root cause. And in this case, unlike with Amazon, the Microsoft engineers were able to provide a good explanation and, more importantly, that particular issue hasn’t re-occurred.

Starting from November 2009 I have kept track of all the sipsorcery failures related to the “clouds” and while I won’t list them all a summary is interesting.

SQL Azure

  • Between 16 Dec 2009 and 25 Mar 2010 there were 22 detected outages totalling 26.9 minutes with the longest being 2 minutes and 43s,
  • On the 28th Mar 2010 an approximately 3 minute outage occurred,
  • On the 9th of April 2010 an approximately 3 hour outage occurred however it is not clear whether this was caused by a network issue or an SQL Azure issue. SQL Azure engineers investigated and found no evidence of a problem and during the outage I was able to connect to the database from outside the EC2 network.

EC2

  • Between 1 November 2009 and 13 Apr 2010 there were 28 detected outages totalling over 32 hours with the longest being 8.5 hours,
  • EC2 outages to date have not occurred simultaneously to both servers (sip1 and sip2) so the outage times apply only to a single server instance and not to the overall service,
  • EC2 outages require a server instance to be manually rebooted, which takes a minimum of 15 minutes. I am notified of outages via SMS and email within 30s, but depending on my circumstances the time to take remedial action can vary between instant and up to 8 hours; on average it’s generally less than 30 minutes.
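
To put the two summaries on a comparable footing, the outage totals can be turned into rough availability percentages. The Ruby below is only back-of-the-envelope arithmetic: it assumes monitoring covered the whole of each date range, uses only the first SQL Azure bullet (the two later outages fall outside that range), treats the EC2 total as exactly 32 hours even though it was somewhat over that, and the method name is made up for illustration.

```ruby
require 'date'

# Rough availability over a monitoring window, as a percentage.
# Assumes monitoring ran for the whole window; the method name is illustrative.
def availability_percent(from, to, outage_minutes)
  window_minutes = (to - from).to_i * 24 * 60
  100.0 * (1 - outage_minutes.to_f / window_minutes)
end

# SQL Azure: 26.9 minutes of outages between 16 Dec 2009 and 25 Mar 2010.
sql_azure = availability_percent(Date.new(2009, 12, 16), Date.new(2010, 3, 25), 26.9)

# EC2: 32 hours of outages between 1 Nov 2009 and 13 Apr 2010 (lower bound).
ec2 = availability_percent(Date.new(2009, 11, 1), Date.new(2010, 4, 13), 32 * 60)

puts format('SQL Azure: %.2f%%', sql_azure)
puts format('EC2 (per instance): %.2f%%', ec2)
```

On those assumptions SQL Azure comes out at roughly 99.98% and EC2 at roughly 99.18%. The EC2 figure is per outage-affected instance rather than for the overall service, since the two servers have not failed simultaneously.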

To answer the original question of whether a telco can run in the cloud, my answer is “probably”, but if part of that cloud is Amazon’s EC2 then the answer is “probably not”. I know of another SIP based service that has recently started on Amazon’s EC2. Unlike sipsorcery they are Linux based, so it will be interesting to follow their experience.

As for sipsorcery, I’ve identified a promising alternative to EC2 that I hope to migrate the service to at some stage. One big advantage of this alternative provider is that they have F5 load balancers, which would allow the sipsorcery service to be deployed reliably without having to depend on SIP SRV records. SRV records have been shown to be pretty poor as a failover mechanism for SIP clients: when a sip1 outage occurs approximately two thirds of the sipsorcery clients drop off and don’t fail over to sip2. However there are some funding challenges involved in migrating sipsorcery from EC2 and I need to come up with some way to pay for the new cloud host.
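
For what it’s worth, the behaviour a well-behaved client is supposed to show with SRV records can be sketched in a few lines of Ruby. This is only an illustration: the record values and host names are invented, the within-priority ordering is a simplification of the weighted selection described in RFC 2782, and the block passed to first_reachable stands in for a real SIP request.

```ruby
# One SRV answer: lower priority values are tried first (RFC 2782).
SrvRecord = Struct.new(:priority, :weight, :target, :port)

# Order SRV records the way a failover-aware client would attempt them.
# Simplification: within one priority the RFC calls for a weighted random
# pick; here the heavier record simply goes first to keep the sketch short.
def failover_order(records)
  records.sort_by { |r| [r.priority, -r.weight] }
end

# Try each target in turn. The block is a stand-in for a real REGISTER or
# INVITE attempt and should return true when the server responds.
def first_reachable(records)
  failover_order(records).find { |r| yield(r) }
end

# Hypothetical records; the host names and values are made up.
records = [
  SrvRecord.new(20, 0, 'sip2.example.com', 5060),
  SrvRecord.new(10, 0, 'sip1.example.com', 5060)
]

# Simulate a sip1 outage: only sip2 answers.
chosen = first_reachable(records) { |r| r.target.start_with?('sip2') }
puts chosen.target
```

A client that stops after the first record, or never re-resolves after a failure, would behave like the two thirds of clients that drop off instead of moving to sip2.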

Aaron

Just in case anyone ends up reading this post hoping for an answer to the radio/podcast question in the title, unfortunately it’s not here. It may be possible, and I’m sure one day it will be, but apart from setting up a media server, which is what sipsorcery is all about avoiding, I don’t know how to do it.

Playing around with publicly accessible media services that can be used with sipsorcery is something I always find interesting and have been doing since the mysipswitch days. I’ve blogged about a previous unsuccessful effort to use a hosted VXML service. I got motivated to write this blog entry after reading a post on the mysipswitch forums by gabbar.singh, Free stuffs. The post contains a link to a site called Polinez which purports to assign a dedicated US landline number to arbitrary podcast URLs and thereby make them accessible to any PSTN phone. In turn, using a SIP Provider or GoogleVoice to call the US landline number makes the podcasts accessible from a SIP phone. I tested it out with a triplej (Australian radio station) free music podcast and, while a number of +16414533901 was allocated, when calling it I get either a seemingly random podcast or an unavailable message.

Following the experience with Polinez I did a bit of a hunt around for any other way to connect a phone to a podcast, along with any free music on hold or other types of streaming services that would work with a phone. Not that surprisingly I didn’t find that many; the web browser is the predominant access mechanism for streams these days and the phone is largely ignored. It’s a shame because there are times when accessing media via a phone, be it hard or soft, is preferable and even superior to a web browser. Flash media, as used by Amazon’s Cloudfront and YouTube, is a great example of an incredibly convenient way to stream media but with no way to easily access it from a phone :(.

The best I can come up with at the moment are a few numbers that provide streaming on hold music. As with the SIP Application Servers, if anyone knows of any others I’d love to hear about them.

  • sip:305@blueface.ie music on hold from Blue Face’s Asterisk server,
  • sip:music@iptel.org fado of Anamar provided by iptel,
  • sip:early_music@iptel.org same as above but this time as early media.

Update 23 Jan 2010: I was mucking around with Tropo to see about getting Blind Transfers working when I spotted that it’s possible to play back mp3s directly from a Tropo application, something I’d missed before. I know my favourite radio station triplej has a live mp3 feed so that got me wondering whether I would finally have an easy way to play live radio on my IP phone?! The Tropo app required is amazingly only two lines:

answer
say "http://202.6.74.107:8060/triplej.mp3"

My initial test calls failed, which made me think that the Tropo server was not able to play mp3 streams and instead needed to fully download any mp3 files it needed to play. I posted a query on the Tropo forums just in case I was missing something. A Tropo staffer responded almost instantaneously that playing mp3 streams was supported and that he was able to connect to the triplej stream without any problems. I tried another few times and after about 4 or 5 calls had success! I am still only able to get the occasional call to connect to the triplej stream but I think that’s the streaming server’s issue not Tropo’s, as I have the same problem trying to connect from Windows Media Player. The reliability aside, that’s the first time I’ve been able to listen to a live radio stream on my phone without having to come up with a custom solution involving Asterisk or the LIVE555 Streaming Media Server, AMAZING!

A “what the heck are you doing it like that for?!!” post popped up on the Forums today and I thought that given I end up answering the question every few months I’d copy it here.

You are correct that your observations have been made by others quite a few times in the past 6 months. I’d also agree that the user experience offered by sipsorcery is pretty poor; the help documentation is next to nothing; the user interface is far from universal and I’m sure contravenes most good usability principles; the list could go on…

Most people have come to associate open source software with being free and generally of a reasonable quality. What is normally overlooked is that successful open source applications – for the sake of argument I’ll classify successful as a project you have had the inclination to use – generally gain a bit of momentum as they grow and pick up a few extra hands to help with the programming, documentation, web site etc. To date that hasn’t happened with sipsorcery/mysipswitch and that’s in part why there are a large number of shortcomings.

At the moment the sipsorcery project consists of:

    1. Me writing the software in my spare time,
    2. The sipsorcery.com and associated development servers hosted on Amazon’s EC2 and Microsoft’s Azure platforms for a cost of around USD500/month,
    3. Me administering the sipsorcery.com service in my spare time. A large portion of that time goes into shutting down fraudulent users attempting to exploit SIP Providers,
    4. Packaging up what is really architected as a centralised server application into a local install for those people with a high enough pain threshold to attempt to run the software on their own machines.

(Another developer, Guillaume, was previously able to help in the mysipswitch days but he is now too busy with work.)

Luckily I highly enjoy all those activities and get a kick out of keeping the whole thing working.

My priorities are:

    1. To make the sipsorcery.com service highly reliable. A short sentence with a very scary amount of work involved (it’s now been over two years since mysipswitch went from a pretty solid single process application to a multi-process, multi-server application with some very difficult stability and scalability problems to solve),
    2. Provide a REST API into sipsorcery.com to make it easier for anyone so inclined to come up with an alternative user interface,
    3. Expand the Silverlight interface to become a real-time call switchboard.

In answer to your main question “…what the heck are you folks doing…”, which is actually “…what the heck are you, Aaron, doing…”, the answer is whatever most interests me. It interests me to be able to write and run a highly reliable SIP platform. Unfortunately, and I’m not being facetious, it doesn’t interest me to write javascript/HTML/AJAX based interfaces; I spent a decade doing that and just got completely sick of it.

You or anyone else are more than welcome to contribute to the project in any way you can or try and encourage another programmer to develop the features you want. I’m not going to develop every feature ever requested but if someone else develops it and it’s useful I’ll happily host it on sipsorcery.com.

As the VoIP (which these days means something like 50% SIP, 40% Skype, 5% IAX and 5% Other) provider service industry has matured over the last 5 years, the providers that have managed to survive have come to the realisation that a business based on transiting voice, which is the foundation of the telecoms industry, is actually a tough business to be in. Without the advantage of owning the single wire that runs into the customer premises, VoIP providers are competing not just on a global stage but also with a product that is rapidly converging towards a cost base whereby big players can offer it for free. Google Voice is the classic example, with a service that currently offers free calls to the US and Canada and undoubtedly more destinations to follow as the service picks up steam. It makes it pretty tough for other North American based VoIP providers to compete with…

What the surviving organic VoIP Providers have realised is that the most attractive segment of the market is business customers, not because they spend more on calls but because they are more likely to be interested in extra features like a hosted PBX or an IVR. Residential customers are more interested in cheap/free calls with no bells and whistles and that results in razor thin margins. At the moment smaller VoIP Providers that have their own number ranges have the advantage that in most countries porting numbers is still onerous. However, as regulators and technology improve, number porting inertia will quickly dissolve as a customer retention mechanism, which is of course a good thing.

Ultimately voice services will come to resemble email services. Traditional telcos and ISPs will bundle a basic service into their broadband products. Web portal companies such as Yahoo, Google, Microsoft et al. will also offer a basic voice service that will integrate with their other offerings, paid for by eyeball ownership when people check their voicemails etc. Dedicated VoIP Providers will continue to exist but will be thinned out to those offering specialist services to power users and business customers, competing with the less nimble traditional telcos who will always be a couple of steps behind, snapping at their heels.

The other thing that will happen is that a voice service won’t actually be a product at all. Instead it will evolve into a personalised media service, starting with video, which is already available on Skype, the eyeball portals via their IM networks and the more advanced SIP providers. Eventually it will reach a point where each person has multiple streams under their control and where at least one will be permanently connected. Personal streams will replace broadband connections. 99% of the population aren’t interested in IP addresses and routers; what they are interested in is being able to control the media on their TV, IP Phone, computer display etc., whether that media happens to be an interaction with another person, watching a movie, playing a game or attending a business meeting. Where the successor to the SIP protocol comes in will be to handle the signalling that makes switching the content of people’s streams seamless: the mechanism to place a call to talk to someone will be the same as a call to watch the latest movie, rather than all the different controls and applications that currently exist.

That’s the future but how will it all come about? In the near future writing streaming media applications will become the same as writing a web application. Once that happens there will be an explosion of new voice/media applications, beyond click-to-call and video blogs, and VoIP Providers will be assimilated into software consultancies or vice-versa since they will be the same thing. Instead of “web apps” we will have “web streams”, undoubtedly coined as “Web S.0” or something equally geeky. In order for streaming media applications to reach the same level of ubiquity as static web applications new application servers are needed. The likes of FreeSWITCH, Asterisk, Wowza and Voxeo are leading the way (the likes of Sun, IBM and Microsoft also use all the SIP buzzwords in their niche products), but at the moment they and similar products require a higher level of expertise than the average web developer possesses and, more importantly, they are not suitable for web hosting providers to deploy in their farms. Once the latter problem is solved the former will closely follow, and when it does internet applications will break out of their browsers and expand to include IP Phones, fax machines, the PSTN, mobiles and any other digital device, or analogue device that is worthwhile enough for someone to have produced an Analog-to-Digital converter for.

While these streaming web application servers are gestating a bunch of specialised but limited services have sprung up to attempt to fill the void.

And there are undoubtedly more similar services around and I’m always interested to hear about them if anyone knows of any.

All of the services listed are limited in the types of web streaming applications that can be developed due to the tight integration between the development environment, signalling platform and media gateway. In addition the business model employed in a number of cases is too restrictive, for example forcing applications to adopt a per minute charge for users severely restricts the appeal to developers and in turn the users they are developing for who are all used to the much more flexible web application models: freemium, content, subscription etc.

The experience from mysipswitch/sipsorcery, which due to the flexibility of Ruby dialplans is a type of streaming media application server, has demonstrated that the key to such applications is to separate the signalling from the media, which surprisingly is something none of the above services do. Well, it’s not so surprising given the business models employed: if you’re charging applications by the minute it’s the media you’re billing for, not intelligent signalling. There are two huge advantages to a streaming application platform that has separated the signalling and media.

  • Media capabilities are limited by end-user devices rather than the application server. Softphones, IP Phones and smartphones such as the iPhone advance at a rapid rate and will invariably introduce media related features that are not supported by an application server. The signalling layer tends to be more stable; there are only so many ways to initiate, transfer and hang up a call.
  • Advanced media service providers can be cherry picked. Different service providers offer specialist services: text-to-speech, face recognition, speech transcription etc; and an application developer would benefit enormously from being able to use different services in their application rather than being constrained to the offerings or lack thereof from a single service provider.
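
As a rough illustration of what signalling-only means in practice, here is a small Ruby sketch. It is not the sipsorcery dialplan API; the method name, the service names and every URI in it are placeholders invented for the example. The point is simply that the application decides where a call should go and hands back a SIP URI, and never touches the audio or video itself.

```ruby
# Signalling-only routing: the application decides where a call goes and
# hands back a SIP URI; it never receives or relays the media itself.
# All provider URIs below are placeholders, not real services.
MEDIA_PROVIDERS = {
  'tts'        => 'sip:tts@speech.example.com',
  'transcribe' => 'sip:transcribe@text.example.net',
  'hold_music' => 'sip:music@media.example.org'
}.freeze

# Returns the destination a call should be forwarded to, or nil if the
# requested service is unknown. Media then flows between the caller's
# device and the chosen provider directly.
def route_call(requested_service)
  MEDIA_PROVIDERS[requested_service]
end

puts route_call('hold_music')
```

Swapping one provider for another is then a one-line change to the table, which is the cherry picking described in the second point above.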

All in all it’s an exciting time to watch the evolution of the streaming web.