While setting up my Nokia E51 with VoIP I was informed that the communication between the handset and the server uses the ITU-T G.711 µ-law codec for the audio without any additional encryption, meaning that it is relatively easy to capture and listen in on. I’d never done a VoIP capture and decode, so I set up a capture on the firewall (tcpdump -i gem0 -s 2000 -w file.cap host x.x.x.x) and grabbed a test phone call made to Danielle as she sat in the living room with some friends.
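To give a sense of why unencrypted G.711 is so easy to listen in on, here is a minimal sketch of the standard µ-law expansion from one payload byte to a 16-bit linear PCM sample. The function name ulaw_to_pcm16 is my own for illustration; this is not code taken from Wireshark or the handset.

```python
def ulaw_to_pcm16(byte: int) -> int:
    """Expand one G.711 mu-law byte to a signed 16-bit linear PCM sample.

    Hypothetical helper for illustration only.
    """
    u = ~byte & 0xFF            # mu-law bytes are stored bit-inverted
    sign = u & 0x80             # top bit: set means a negative sample
    exponent = (u >> 4) & 0x07  # 3-bit segment number
    mantissa = u & 0x0F         # 4-bit step within the segment
    # 0x84 (132) is the bias added by the encoder; remove it after scaling.
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def decode_ulaw(payload: bytes) -> list[int]:
    """Decode a whole mu-law payload into a list of PCM samples."""
    return [ulaw_to_pcm16(b) for b in payload]
```

Each byte on the wire is one audio sample, so there is no state to track and no key to find: anyone holding the packets can turn them back into sound with a few lines like these.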
After opening the capture in Wireshark I used the basic built-in VoIP analysis tool to get the windows shown above. The main window is the capture and decode itself, another shows the one detected VoIP call and its details, and the third is a basic playback window replaying the audio of the phone call. (Click on the image above or here for a full resolution copy of the screenshot.)
Using Wireshark's RTP stream analysis feature one is able to save out the audio as an .au file. I ran into some problems with this, as one half of the conversation was padded with a few minutes of silence during export (a Wireshark bug, it seems), but the audio is still very much available. Both halves of the conversation were then brought into Audacity and aligned, the level of the inbound (remote, Danielle) side was brought up a bit, and the audio was exported as an MP3: voip_capture_sample.mp3.
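For the curious, the export step is not much more than stripping the RTP header from each packet and putting a small header in front of the raw bytes. The sketch below is my own illustration, not how Wireshark does it: the names rtp_payload and build_au are hypothetical, and it assumes the packets are already in order, carry G.711 µ-law (RTP payload type 0), and are 8000 Hz mono.

```python
import struct


def rtp_payload(packet: bytes) -> bytes:
    """Return the audio payload of one RTP packet (hypothetical helper)."""
    first = packet[0]
    csrc_count = first & 0x0F          # number of 4-byte CSRC entries
    has_extension = bool(first & 0x10)
    has_padding = bool(first & 0x20)
    offset = 12 + 4 * csrc_count       # fixed header plus CSRC list
    if has_extension:
        # Extension header: 2-byte profile id, 2-byte length in 32-bit words.
        ext_words = struct.unpack_from("!H", packet, offset + 2)[0]
        offset += 4 + 4 * ext_words
    # With padding, the last byte says how many trailing bytes to drop.
    end = len(packet) - packet[-1] if has_padding else len(packet)
    return packet[offset:end]


def build_au(ulaw_bytes: bytes, sample_rate: int = 8000) -> bytes:
    """Wrap raw mu-law bytes in a minimal 24-byte .au header."""
    header = struct.pack(
        ">4sIIIII",
        b".snd",          # magic number
        24,               # offset of the audio data
        len(ulaw_bytes),  # size of the audio data in bytes
        1,                # encoding 1 = 8-bit G.711 mu-law
        sample_rate,
        1,                # channels (assumed mono)
    )
    return header + ulaw_bytes
```

Joining the payloads from one direction of the call and passing them to build_au gives a file that Audacity can open directly. Lost or reordered packets are ignored here, which a real tool would have to handle.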
This capture and decoding was easy for me to do because of the ready access to my own network and lack of encryption of the session. Getting another person’s calls is generally a bit more complicated. That said, imagine how easy it must be for a large government agency with a tremendous budget, amazing computing resources, and access to the backbones of the country’s telecommunications infrastructure.