Imagine that you are working over a perfectly secure network with nobody in the middle, and with no key loggers, sniffers or any other sort of spyware on the local workstation or in the communication channel. Can you be sure that no information will be exposed?
Think again. Snooping at the digital level is not the only way to sniff at computing activity. The sound that your fingers make while touching the keyboard, emissions from your monitor and the noise from your printer, CPU and hard disk can be used to find out what you are doing at your PC.
The technique that makes this possible is called acoustic cryptanalysis, in which the sounds a machine produces during computation and input-output are analysed. More generally, such attacks are called side-channel attacks, because they exploit unintentional information leaks from seemingly secure systems.
Recall how some people can tell which number is being pressed on a phone by the distinct tone of each key. In a similar way, when we type words on a keyboard, a distinct sound pattern is generated, because each key has a slightly different sound that can be analysed. If repeated probes are possible, well-established machine learning algorithms and statistical methods can be applied to correctly reproduce up to 90 per cent of the typed text.
Asonov and Agrawal are credited with showing experimentally how this works. Other researchers later followed up on their work, enhancing the techniques and methodologies. Let us take a brief look at how this works.
To a human ear, each key sounds the same. But when the frequency spectrum of a recording made with a simple PC microphone is examined, several subtle distinguishing features emerge. Notably, there are two distinctive peaks in the keyboard sound spectrum.
The first one comes at the time of the push, called the “push peak”, and the second at the time of key release, known as the “release peak”. These readings, called features, are then normalised between 0 and 1 so that they can be used as input data for a neural network, which learns the sound differences from repeated samples, say, each key pressed 100 times as training data.
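The feature step described above can be illustrated with a minimal sketch. It assumes a short window of audio samples around one key press, takes its magnitude spectrum with a naive discrete Fourier transform, and rescales the result to the 0 to 1 range. The function names and the synthetic "key press" are hypothetical, chosen only for illustration, and are not taken from Asonov and Agrawal's work.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitude spectrum of a short real-valued window (naive O(n^2) DFT)."""
    n = len(samples)
    return [
        abs(sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def min_max_normalise(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat input carries no information; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical stand-in for a recorded key press: a decaying tone that
# completes three cycles inside a 32-sample window.
window = [math.exp(-t / 16) * math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]

# One feature vector, ready to be fed to a classifier such as a neural network.
features = min_max_normalise(dft_magnitudes(window))
```

In a real attack the window would come from a microphone recording, and many such vectors per key would form the training set.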
This is the technique used by Asonov and Agrawal. They further refined their feature collection by zooming in on the push peak and found that it in turn consists of two peaks: the “touch peak”, when the finger touches the key, and the “hit peak”, when the key actually comes into contact with the keyboard plate underneath it.
Experimentation showed that the touch peak was best suited to demonstrating the vulnerability of keyboards. Once the neural network is trained with keys and their corresponding features, it can identify typed input with an accuracy of more than 90 per cent, much along the lines of a known-plaintext cryptographic attack.
This is true not only of normal PCs but also of notebooks, ATMs and telephone pads, which means PINs and other identification information can also be sniffed by simple and inexpensive acoustic cryptanalysis that requires just a microphone and a suitably programmed application.
Three other researchers, Li Zhuang, Feng Zhou and J.D. Tygar of the University of California at Berkeley, experimented with acoustic emanations and advanced the technique discussed above. They removed the constraint that, to achieve high accuracy, the neural network must be trained on the same keyboard used by the same person.
They argued that without these ideal conditions, accuracy rates drop drastically to 25 per cent. Their research also argues that instead of first learning individual keys and then detecting them in typed sequences, learning language patterns provides a better alternative, because the rules of English spelling and grammar apply most of the time during a normal keyboard input session.
This is much like breaking a text enciphered through substitution, as it involves the study of language features. Recall the short stories in which a detective looks for the character substituted for “e” (the most commonly used letter), finds instances of the word “the” (easy to locate because it recurs so often) and thus breaks the substitution sequence to decrypt the cipher text.
Before applying language knowledge, Li, Feng and Tygar also improved the initial classification technique by using the cepstrum to extract features from keystroke sounds. This method is used extensively in voice recognition and produces better results than the techniques used in earlier experiments.
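For readers unfamiliar with the term, the cepstrum of a signal is the inverse Fourier transform of the logarithm of its magnitude spectrum. The sketch below computes this plain "real cepstrum" for a short window. It is only an illustration of the definition: the exact feature pipeline the researchers used is not described here, and the function names, the `floor` parameter and the synthetic window are assumptions.

```python
import cmath
import math


def dft(values):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(values)
    return [
        sum(values[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def real_cepstrum(samples, floor=1e-12):
    """Inverse DFT of the log magnitude spectrum of a real-valued window."""
    n = len(samples)
    # The floor avoids log(0) for frequency bins with no energy.
    log_mag = [math.log(max(abs(c), floor)) for c in dft(samples)]
    # log_mag is real, so the real part of its inverse DFT equals the real
    # part of its forward DFT divided by n.
    return [c.real / n for c in dft(log_mag)]


# Hypothetical keystroke window: a short decaying tone.
window = [math.exp(-t / 8) * math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]

# Keep only the first few coefficients as a compact feature vector.
features = real_cepstrum(window)[:8]
```

The low-order coefficients summarise the overall shape of the spectrum, which is why cepstral features are popular in speech processing.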
The second improvement is to group keystrokes into classes such that one class may contain more than one keyboard character and one character may appear in more than one class.
The number of classes is a bit greater than the number of actual keys. Each key is assigned a probability of being in a certain class, so there is no fixed key-to-class mapping. At the time of detection, when a decision has to be made about which key a keystroke represents, a Hidden Markov Model (HMM) is used. For example, if a user types a character that could be either “o” or “p”, a detection problem is at hand because the two keys are adjacent. But if the last character was recognised as “y”, the HMM would suggest the sequence “yo” instead of “yp”, since the former is more common, occurring in words like “yoga”, “yolk” and “you”.
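The “yo” versus “yp” decision can be reproduced with a toy Hidden Markov Model decoded by the standard Viterbi algorithm. In this sketch the hidden states are the keys, the observations are sound classes, and every probability is invented purely for illustration. Sound class “B” is deliberately made slightly more likely to come from “p” than from “o”, so that the acoustics alone would give the wrong answer.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden state sequence for the observations."""
    # best[s] holds (probability, path) for the best path ending in state s.
    best = {s: (start_p[s] * emit_p[s][observations[0]], [s]) for s in states}
    for obs in observations[1:]:
        best = {
            s: max(
                (
                    (p * trans_p[prev][s] * emit_p[s][obs], path + [s])
                    for prev, (p, path) in best.items()
                ),
                key=lambda item: item[0],
            )
            for s in states
        }
    return max(best.values(), key=lambda item: item[0])[1]


states = ["y", "o", "p"]
start_p = {"y": 0.4, "o": 0.3, "p": 0.3}
# Letter-to-letter probabilities (made up): "yo" is far more common than "yp".
trans_p = {
    "y": {"y": 0.05, "o": 0.80, "p": 0.15},
    "o": {"y": 0.20, "o": 0.30, "p": 0.50},
    "p": {"y": 0.20, "o": 0.50, "p": 0.30},
}
# Probability that each key produces each sound class (made up).
emit_p = {
    "y": {"A": 0.90, "B": 0.05, "C": 0.05},
    "o": {"A": 0.05, "B": 0.45, "C": 0.50},
    "p": {"A": 0.05, "B": 0.55, "C": 0.40},
}

heard = ["A", "B"]
acoustic_only = max(states, key=lambda s: emit_p[s]["B"])  # "p"
decoded = viterbi(heard, states, start_p, trans_p, emit_p)  # ["y", "o"]
```

Judged on sound alone, the second keystroke looks like “p”, but once the letter-transition probabilities are taken into account the decoder settles on “yo”.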
The results thus obtained are further polished by checking them against a dictionary for spelling and grammar, again using an HMM. As words are corrected, the results are fed back to improve the detection, so the accuracy rate goes up over time. For phrase correction, n-gram language models are used to evaluate the association between neighbouring words and their frequencies.
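The dictionary step can be pictured with a much simpler stand-in than the HMM-based correction the researchers describe. The sketch below uses fuzzy string matching from Python's standard `difflib` module to snap each recovered word to its closest dictionary entry. The word list, the sample guesses and the `correct_word` helper are all hypothetical.

```python
import difflib


def correct_word(guess, dictionary, cutoff=0.6):
    """Return the closest dictionary word, or the guess itself if none is close."""
    matches = difflib.get_close_matches(guess, dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else guess


# Hypothetical dictionary and hypothetical output of the keystroke classifier.
dictionary = ["password", "passport", "pastoral", "keyboard"]
recovered = ["passwprd", "keybiard", "xqzv"]

corrected = [correct_word(word, dictionary) for word in recovered]
```

A word with a single misheard key is pulled back to the right entry, while a string that resembles nothing in the dictionary is left untouched, which is the sensible behaviour for passwords and other random text.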
Such advanced techniques can reveal more interesting results. For instance, if the URL of an email site or CTRL+ALT+DEL is detected, the next input is most likely to be a username and password. Applying this sort of knowledge, accuracy rose to 96 per cent, and a 90 per cent accuracy rate was achieved for random text.
Based upon these findings, the researchers rightly claim that emanation attacks are far more challenging, serious and realistic than previously realised. Now, one might ask what actually makes a keyboard produce different sounds in the first place.
It is basically the plate beneath the keys which, when struck at different positions by different keys, produces distinguishable sounds, much like hitting a drum surface at different locations. Asonov and Agrawal tested this hypothesis experimentally by cutting up a keyboard with a milling machine and observing that the neural network was no longer able to identify the keystrokes.
So what could the counter-measures against such sophisticated attacks be? Firstly, make sure that the computing facility is not bugged with hidden microphones or transmitters, particularly in sensitive environments. Note that even when a facility is free of spying devices, sounds can be picked up from outside the room.
This can be done using parabolic microphones placed at a considerable distance, or with other spying techniques. Where physical screening is not possible, nearly silent keyboards (not the commonly available quiet keyboards) should be used, such as those with rubber keys, to ensure that no intelligible sound escapes. Creating homophonic keyboards, on which every key sounds alike, is also a solution, possibly by using multiple plates for keys or a vibration-free plate.
Using other forms of keyboard, such as touch screens or light-rendered keyboards that draw a virtual keyboard on hard surfaces or even in the air, is also a promising option, albeit a costly one. To guard against password sniffing alone, a combination of authentication mechanisms such as smart cards and biometrics can be employed.
Apart from keyboard sounds, it is also possible to mount timing attacks against a computer by studying the humming sound (or characteristic acoustic spectral signature) it makes during an encryption or decryption process. Timing attacks measure the time it takes to produce an encrypted output given a certain secret key.
Knowing exactly how long it takes for a known algorithm like RSA to encrypt, say, the word “science” with different key lengths, the attackers attempt to guess the key. Such attacks have been demonstrated by Adi Shamir and Eran Tromer.
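Why should the running time reveal anything about a key? A minimal sketch, assuming the textbook square-and-multiply method of modular exponentiation that underlies RSA, makes the leak visible. It counts arithmetic operations as a stand-in for time: the number of operations depends on how many 1 bits the secret exponent contains. The function name and the toy numbers are hypothetical, and this is not a reconstruction of Shamir and Tromer's acoustic method.

```python
def modexp_with_op_count(base, exponent, modulus):
    """Square-and-multiply exponentiation that also counts its operations."""
    result = 1
    ops = 0
    base %= modulus
    while exponent > 0:
        if exponent & 1:
            # Extra multiplication, performed only when the current key bit is 1.
            result = (result * base) % modulus
            ops += 1
        # Squaring, performed once for every bit of the exponent.
        base = (base * base) % modulus
        ops += 1
        exponent >>= 1
    return result, ops


# Two toy 8-bit "secret keys" with very different numbers of 1 bits.
sparse_key = 0b10000001
dense_key = 0b11111111

_, sparse_ops = modexp_with_op_count(7, sparse_key, 1009)  # 8 squarings + 2 multiplications
_, dense_ops = modexp_with_op_count(7, dense_key, 1009)    # 8 squarings + 8 multiplications
```

Both keys have the same length, yet the second one costs noticeably more work. An attacker who can measure that difference, whether with a stopwatch or a microphone, learns something about the key without ever seeing it.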
To guard against such attacks, the CPU can be placed in a soundproof case, a constant masking noise can be introduced in parallel or the circuitry can be enhanced to produce low emanations.
Side-channel attacks are not new. Back in 1956, MI5 is said to have used an acoustic attack against Egyptian Hagelin cipher machines in an operation code-named ENGULF. And the US set of standards called TEMPEST calls for devices such as computer chips, monitors and printers to limit electronic, electromagnetic, radio and other emanations.
Many of its details, including the testing process, remain classified. But the area has been taken up for active research by notable universities and computing companies, which has increased public awareness of and interest in this field.
Acoustic emanation is just one of the many ways through which computer security can be compromised. This once again reminds us that computer security fits into an overall scheme of protection and is not limited to any particular aspect of information hiding.
The writer is an IT professional and a freelance contributor