For some reason after being completely disinterested in doing anything with the RTP and audio side of VoIP calls for the last 5 or so years suddenly in the last month I decided to explore how well a .Net based softphone would work. Consequently I started tinkering around with a .Net library called NAudio that I’d seen mentioned around the traps. For my purposes NAudio provided a convenient way to get at the underlying Windows API calls for interacting with audio input and output devices. It took a little bit of time and effort to get things working but eventually I was able to successfully read audio samples from my microphone and write samples to my speakers through a test .Net application.
The softphone is open source and available in a binary form here and the source is availabe here in the sipsorcery-softphone project. Before going any further it should be noted that the softphone is extremely rudimentary and geared towards developers or VoIP hobbyists wanting to tinker rather than end users looking for trouble free calling. The user interface is extremely lacking and there are also crucial components missing such as echo cancellation, a jitter buffer, codec support (G.711 u-law is the only codec supported) etc.
My original verdict on using .Net as a softphone platform was that it was not particularly good. This was due to the fact that the microphone samples coming from NAudio were only capable of being delivered with a sample period of 200ms which is useless since the in practice the jitter buffer at the remote end will drop any packet over 50 or 100ms. However it turned out that a combination of some inefficient code in my RTP packet parsing and the fact that I was testing by running the softphone in Visual Studio debug mode was responsible for the high sampling latency. Once those issues were removed the microphone samples have been delivered reliably with a sample period of 20ms exactly as required. I was thinking i I ever wanted to have a usable softphone I’d have to move the RTP and audio processing to a C++ library but now I’m starting to believe that’s not necessary and .Net is capable of handling the 20ms sample period.
The other thing worth mentioning about the softphone is that it’s capable of placing calls directly to Google Voice’s XMPP gateway. I’m still surprised that none of the mainstream softphone developers have bothered to add the STUN bindings to their RTP stacks so that they could work with Google Voice. In the end I decided I’d just prototype it myself just for kicks. For a softphone that already has RTP and STUN protocol support adding the ability to work with Google Voice in conjunction with a SIP-to-XMPP gateway (which SIPSorcery coudl do) would literally be less than 20 lines of code.
Hopefully the softphone will be useful to someone. Judging on the number of queries I get about the SIPSorcery softphone project and the questions about .Net softphones on stackoverflow I imagine it will be.