Wed, 17 February 2010
In this podcast we discuss the emerging threat of steganography in voice over IP.
This is really interesting - is it something that is already happening?
Currently, this seems to be confined to research labs. The primary reference for this podcast is an IEEE Spectrum article by three professors - Józef Lubacz, Wojciech Mazurczyk & Krzysztof Szczypiorski - at Warsaw University of Technology. The article comes out of their ongoing research in the Network Security Group to identify emerging threats and develop countermeasures.
Before we delve into this new topic, let's provide the audience with a little background. First, what is steganography - sounds like a dinosaur?
Yeah - the Stegosaurus. I'm not sure how or if the two are related; we'll leave that one for the paleontologists in the audience.
Steganography is something that has been around a long time - some say as far back as 440 BC. While encryption takes our message and scrambles it, so that an unintended recipient cannot read it, steganography attempts to hide or obscure that a message even exists. The researchers refer to steganography as "meta-encryption." Another useful analogy they use is to refer to the secret message and the carrier within which it is hidden.
Can you give us some examples?
If we start in ancient times, we can point to examples of shaving a messenger's head, tattooing a message on the scalp, letting the hair grow back and sending the messenger off. Other examples include using invisible ink or even writing on boiled eggs with an ink that penetrates the shell and can be read by peeling the egg. Simon Singh's "The Code Book" is a great read that details the history of encrypting and obscuring information.
What about some more modern examples?
When we refer to modern steganography we are usually referring to digital steganography. Digital steganography takes advantage of digital data by (for example) hiding a message within image, audio, or video files. In this case the image, audio or video file is the carrier. The larger the file (image, audio or video), the larger the message it can carry. The researchers contend that a single 6-minute mp3 audio file, say roughly 30 megabytes in size, could be used to conceal every play written by Shakespeare.
So how does this work?
Say you and I wanted to communicate using steganography. We would each download one of the hundreds of freely available stego apps. You would take a fairly innocuous image file, use the software to embed a message into that file, and send me the altered file. To anyone else, this would just look like a photo you're sharing with a friend, but because I know there's a hidden message, I open it with the same stego app and read the hidden message. You could also add a password to further protect the message.
So how do we stop this?
This is a specialized field called "steganalysis." The simplest way to detect a hidden message is to compare the carrier file - our innocuous image - to the original. A file that is larger than the original is a red flag. This of course presupposes that you have access to the original file. Usually you won't, so instead we look for anomalies. Is the audio file significantly larger than a 3-minute audio file should be? We can also use spectrum analysis or look for inconsistencies in the way the data has been compressed.
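The comparison against the original can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name compare_to_original and the byte values are made up for the example, and real steganalysis tools work on decoded image or audio data rather than raw byte strings. The sketch reports whether the suspect file grew and how many bytes differ from the original only in their lowest bit, which is the kind of change simple embedding tools tend to make.

```python
def compare_to_original(original: bytes, suspect: bytes) -> dict:
    """Report simple red flags when the untouched original is available."""
    # Positions (within the shared length) where the two files differ.
    changed = [i for i, (a, b) in enumerate(zip(original, suspect)) if a != b]
    # Of those, positions where only the lowest bit was flipped.
    lsb_only = [i for i in changed if (original[i] ^ suspect[i]) == 1]
    return {
        "size_grew": len(suspect) > len(original),
        "bytes_changed": len(changed),
        "lsb_only_changes": len(lsb_only),
    }


# Hypothetical data: two bytes differ in their lowest bit, one byte was appended.
original = bytes([10, 20, 30, 40])
suspect = bytes([11, 20, 31, 40, 99])
report = compare_to_original(original, suspect)
```

A report where nearly every change is confined to the lowest bit would be worth a closer look; on its own it proves nothing.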
How would spectrum analysis help?
Some steganography techniques take our digital data and modify the least-significant bit (LSB) of each value. In our digital data the LSB often just shows up as noise and doesn't affect the image, audio, or video quality. A spectrum analyzer would help us compare the "noise" against an unaltered sample and try to identify anomalies.
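To make the least-significant-bit idea concrete, here is a minimal Python sketch. It is not how any particular stego app works; the function names embed_lsb and extract_lsb are invented for this example, and the carrier is a stand-in byte string rather than a real image or audio file. Each bit of the message replaces the lowest bit of one carrier byte, so no carrier value changes by more than 1.

```python
def embed_lsb(carrier: bytes, message: bytes) -> bytes:
    """Hide each message bit in the lowest bit of one carrier byte."""
    # Unpack the message into bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("carrier too small for message")
    stego = bytearray(carrier)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the lowest bit, then set it
    return bytes(stego)


def extract_lsb(stego: bytes, length: int) -> bytes:
    """Rebuild `length` message bytes from the lowest bits of the carrier."""
    out = bytearray()
    for i in range(length):
        value = 0
        for carrier_byte in stego[i * 8:(i + 1) * 8]:
            value = (value << 1) | (carrier_byte & 1)
        out.append(value)
    return bytes(out)


carrier = bytes(range(256))  # stand-in for raw sample or pixel data
stego = embed_lsb(carrier, b"hi")
recovered = extract_lsb(stego, 2)
max_change = max(abs(a - b) for a, b in zip(carrier, stego))
```

Note that the stego data is exactly the same length as the carrier, which is why a size check alone will not catch this kind of embedding.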
Wow - that's scary stuff. What about Voice over IP?
Voice over IP, or VoIP (pronounced "voype"), is a transmission technology that enables us to deliver voice communications over IP networks such as the Internet. This is an alternative to using the traditional PSTN, or public switched telephone network, for voice communications. In VoIP, we take our analog voice signal, convert it to a digital signal, and "chop" it up into smaller pieces that are carried in IP packets. These packets are sent over our data network and reassembled at the destination.
To understand packet-switched networks, consider the US Postal system – our packets are analogous to postal letters or parcels, numbered, sent across a network and re-assembled at the receiving end. Packets do not necessarily follow the same path from source to destination and may even arrive out of sequence. In VoIP, it's more important that we transmit our data quickly, so we forgo waiting for late packets or asking for lost ones to be resent.
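The postal analogy can be sketched in Python: chop the data into numbered pieces, let them arrive in any order, and put them back together by number. The names packetize and reassemble, the 8-byte packet size, and the sample text are assumptions made for this illustration; real voice packets carry encoded audio and far more header information.

```python
import random


def packetize(data: bytes, size: int) -> list[tuple[int, bytes]]:
    """Chop data into (sequence number, payload) pairs of at most `size` bytes."""
    return [(seq, data[start:start + size])
            for seq, start in enumerate(range(0, len(data), size))]


def reassemble(packets: list[tuple[int, bytes]]) -> bytes:
    """Sort by sequence number and join the payloads."""
    return b"".join(payload for _, payload in sorted(packets))


message = b"packets may take different routes and arrive out of order"
packets = packetize(message, 8)
random.Random(0).shuffle(packets)  # simulate out-of-order arrival
restored = reassemble(packets)
```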
So what about this new class of steganography?
One of the disadvantages of existing techniques is the size limitation of the carriers. If someone tries to put too large a message into an audio file, it becomes easier to detect. With VoIP, our message is hidden among the packets - even bits - of voice data being transmitted. In a sense, older technologies used a digital file as the carrier, while these new, emerging techniques use the communication protocol itself as the carrier. The size of the hidden message is limited only by the length of the call. While detecting a hidden message in a physical file is not trivial, the difficulty of finding a hidden message increases by an order of magnitude when there is no physical file to examine. The researchers are calling this new class of steganography "network steganography."
So how does network steganography work?
The researchers have developed three methods that all manipulate the IP or Internet Protocol and take advantage of the fact that this is a connectionless and unreliable protocol. Network steganography exploits errors (data corruption and lost packets) that are inherent in the Internet Protocol.
What are the three methods?
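The three methods or flavors of network steganography that the researchers have developed are LACK (Lost Audio Packet Steganography), HICCUPS (Hidden Communication System for Corrupted Networks), and Protocol Steganography.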
Briefly, LACK hides a message in packet delays, HICCUPS disguises a message as noise, and Protocol Steganography uses unused fields in the IP protocol to hide information.
So let's talk a little bit more about each - first LACK.
VoIP traffic is very time sensitive - if a voice packet (about 20 milliseconds of conversation) is delayed, we can continue our conversation without significantly affecting the call quality. Once the delayed packet does arrive at the receiver, it's already too late; the packet is useless and is discarded. That's the way VoIP is designed to work. LACK intentionally delays some packets and places the "steganograms" in these intentionally delayed packets. To an unintended recipient, these packets appear to be late and are discarded, but to the party you're communicating with they are retained and decoded to extract a hidden message. LACK is a simple technique that is hard to detect.
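Here is a toy Python simulation of the LACK idea, not the researchers' implementation. Everything in it is an assumption made for illustration: the Packet class, the 60 ms lateness limit, the choice to use every fourth packet, and the 200 ms of added delay. It also ignores the fact that a genuinely late voice packet would look the same as a steganogram to the covert receiver.

```python
from dataclasses import dataclass

MAX_DELAY_MS = 60  # hypothetical limit after which a receiver gives up on a packet


@dataclass
class Packet:
    seq: int
    payload: bytes
    delay_ms: int


def lack_send(voice_frames, secret_chunks, every=4, extra_delay_ms=200):
    """Swap every `every`-th voice payload for a secret chunk and delay it."""
    chunks = iter(secret_chunks)
    packets = []
    for seq, frame in enumerate(voice_frames):
        chunk = next(chunks, None) if seq % every == every - 1 else None
        if chunk is None:
            packets.append(Packet(seq, frame, 20))  # ordinary, on-time packet
        else:
            packets.append(Packet(seq, chunk, 20 + extra_delay_ms))  # steganogram
    return packets


def ordinary_receiver(packets):
    """Play only packets that arrive in time; late ones are thrown away."""
    return [p.payload for p in packets if p.delay_ms <= MAX_DELAY_MS]


def covert_receiver(packets):
    """Keep exactly the packets an ordinary receiver would discard."""
    return b"".join(p.payload for p in packets if p.delay_ms > MAX_DELAY_MS)


frames = [b"voice%02d" % i for i in range(12)]
packets = lack_send(frames, [b"me", b"et", b"@9"])
played = ordinary_receiver(packets)
secret = covert_receiver(packets)
```

In this run the ordinary receiver plays nine voice frames and never sees the three delayed packets, while the covert receiver joins their payloads into the hidden message.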
What about HICCUPS?
HICCUPS works on wireless local area networks and takes advantage of corrupted packets. Normally, in a wireless network, we check for corrupted data by examining the checksum of a received packet. If the checksum doesn't match what we expect, we discard the packet. HICCUPS hides our message - the steganograms - in these seemingly "corrupted" packets. Unintended recipients will discard these packets, but our cohort knows to look for these "corrupted" packets and to retain and examine them. This method is difficult to use, because it requires a network interface card that can generate incorrect checksums. It is also difficult to detect.
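The checksum trick can be sketched in Python with plain byte strings. This is a simplified illustration, not the real thing: actual HICCUPS works on wireless frames and, as noted, needs special hardware. The frame layout, the function names, and the rule that a covert frame carries an inverted CRC-32 are all assumptions invented for this example; zlib.crc32 is the standard-library checksum function.

```python
import zlib


def make_frame(payload: bytes, covert: bool = False) -> bytes:
    """Append a 4-byte CRC-32; covert frames get a deliberately wrong one."""
    crc = zlib.crc32(payload)
    if covert:
        crc ^= 0xFFFFFFFF  # invert every bit so the check fails on purpose
    return payload + crc.to_bytes(4, "big")


def split_frame(frame: bytes) -> tuple[bytes, int]:
    return frame[:-4], int.from_bytes(frame[-4:], "big")


def normal_station(frames):
    """Keep frames whose checksum matches; silently drop the rest."""
    kept = []
    for frame in frames:
        payload, crc = split_frame(frame)
        if zlib.crc32(payload) == crc:
            kept.append(payload)
    return kept


def covert_station(frames):
    """Collect frames whose checksum is 'wrong' in the agreed way."""
    hidden = []
    for frame in frames:
        payload, crc = split_frame(frame)
        if (zlib.crc32(payload) ^ 0xFFFFFFFF) == crc:
            hidden.append(payload)
    return b"".join(hidden)


frames = [
    make_frame(b"web page"),
    make_frame(b"meet ", covert=True),
    make_frame(b"email"),
    make_frame(b"at nine", covert=True),
]
visible = normal_station(frames)
secret = covert_station(frames)
```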
Okay what about Protocol Steganography?
Here, we're hiding our message in the header fields of the IP packet itself. In particular, we're hiding information in unused, optional or even partially used fields. To make it even harder to detect, we could use fields whose values change frequently from packet to packet.
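As a rough Python sketch of hiding data in a header field, the example below packs two message bytes into the 16-bit Identification field of an IPv4 header, a field whose value normally changes from packet to packet. The choice of that field is my illustration and not necessarily what the researchers used. The function names, the addresses (taken from a range reserved for documentation), and the zeroed header checksum are simplifications; a real packet would need a valid checksum and a payload.

```python
import struct

# Basic 20-byte IPv4 header: version/IHL, TOS, total length, identification,
# flags/fragment offset, TTL, protocol, header checksum, source, destination.
IPV4_HEADER = "!BBHHHBBH4s4s"


def build_header(ident: int, total_length: int = 40) -> bytes:
    return struct.pack(
        IPV4_HEADER,
        0x45, 0, total_length,
        ident,                    # the field that carries the hidden data
        0, 64, 17,                # no fragmentation, TTL 64, protocol 17 (UDP)
        0,                        # checksum left at 0 for brevity (assumption)
        bytes([192, 0, 2, 1]),    # example source address
        bytes([192, 0, 2, 2]),    # example destination address
    )


def hide(message: bytes) -> list[bytes]:
    """Spread the message over headers, two bytes per Identification field."""
    if len(message) % 2:
        message += b"\x00"  # pad to a whole number of 16-bit fields
    return [build_header(int.from_bytes(message[i:i + 2], "big"))
            for i in range(0, len(message), 2)]


def reveal(headers: list[bytes]) -> bytes:
    """Read the Identification field back out of each header."""
    data = b"".join(struct.unpack(IPV4_HEADER, h)[3].to_bytes(2, "big")
                    for h in headers)
    return data.rstrip(b"\x00")  # drop padding (assumes no trailing NULs in message)


headers = hide(b"hello")
recovered = reveal(headers)
```

At two bytes per packet the channel is slow, but a call that sends dozens of packets per second adds up quickly.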
So, should we be worried?
I don't think so. The majority of the steganography applications seem to be focused on altering images, which appears to be the easiest form of steganography. While the techniques these researchers have developed are technically feasible, I'm not sure that they're easily implemented. There has been lots of speculation regarding terrorist organizations using steganography to communicate; however, no one has been able to document that this has actually happened. That said, I have no doubt that these groups are exploring ways to mask their communications and that the NSA has developed and uses a wide array of tools and countermeasures for steganography.