EarSpy Attack allows Android snooping via ear speaker vibrations
Researchers have developed a new eavesdropping strategy that targets Android users. Dubbed “EarSpy”, the attack enables speech eavesdropping by capturing ear speaker vibrations on Android smartphones.
EarSpy Attack Spies On Android Phones
Android phones have long been a lucrative target for hackers and snoopers worldwide. That is why it is important to discover and remediate the ways attackers can violate users’ privacy before those weaknesses are actively exploited.
To that end, a team of researchers has developed a new eavesdropping attack against Android phones: the ‘EarSpy’ attack.
Briefly, the attack exploits the impact of ear speaker vibrations on motion sensors in an Android smartphone. The researchers observed that most modern Android devices include powerful ear speakers (usually stereo speakers) that generate more sound pressure on the built-in accelerometer than conventional speakers.
Similarly, most modern phones also have more sensitive accelerometers and gyroscopes (motion sensors), which makes them more responsive to sound vibrations. Therefore, capturing the readings of these motion sensors can allow an adversary to decipher the speech.
The researchers played the word “Zero” six times at 5-second intervals through the ear speakers of the OnePlus 7T handset (which has large dual speakers at the top and a speaker at the bottom). They then captured the accelerometer readings, since the accelerometer does a better job of capturing sound vibrations than the gyroscope. They repeated the experiment with the OnePlus 3T, an older device with relatively less powerful speakers. The OnePlus 7T produced visibly stronger vibrations in the sensor data than the 3T.
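To make the playback experiment above more concrete, here is a minimal Python sketch of how six evenly spaced vibration bursts might be located in an accelerometer trace. It is not the researchers’ code: the trace is synthetic, and the sampling rate, burst length, 60 Hz tone, noise level, and threshold are all assumptions chosen for illustration.

```python
import math
import random

SAMPLE_RATE_HZ = 200   # assumed sensor sampling rate (not taken from the paper)
INTERVAL_S = 5.0       # gap between playbacks, as described in the experiment
BURST_S = 0.5          # assumed duration of one spoken word
NUM_WORDS = 6          # the word was played six times


def synth_trace(seed=0):
    """Build a fake accelerometer trace: low-level noise plus six bursts."""
    rng = random.Random(seed)
    total = int(SAMPLE_RATE_HZ * INTERVAL_S * NUM_WORDS)
    trace = [rng.gauss(0.0, 0.01) for _ in range(total)]
    for k in range(NUM_WORDS):
        # Each burst starts one second into its 5-second slot (assumption).
        start = int((k * INTERVAL_S + 1.0) * SAMPLE_RATE_HZ)
        for i in range(int(BURST_S * SAMPLE_RATE_HZ)):
            t = i / SAMPLE_RATE_HZ
            trace[start + i] += 0.2 * math.sin(2 * math.pi * 60.0 * t)
    return trace


def frame_rms(trace, frame_len):
    """Root-mean-square level of consecutive non-overlapping frames."""
    return [
        math.sqrt(sum(x * x for x in trace[i:i + frame_len]) / frame_len)
        for i in range(0, len(trace) - frame_len + 1, frame_len)
    ]


def find_bursts(trace, frame_len=20, threshold=0.05):
    """Return (start_s, end_s) spans where the frame RMS exceeds the threshold."""
    spans, start = [], None
    for idx, level in enumerate(frame_rms(trace, frame_len)):
        if level > threshold and start is None:
            start = idx
        elif level <= threshold and start is not None:
            spans.append((start * frame_len / SAMPLE_RATE_HZ,
                          idx * frame_len / SAMPLE_RATE_HZ))
            start = None
    if start is not None:  # trace ended while still inside a burst
        spans.append((start * frame_len / SAMPLE_RATE_HZ,
                      len(trace) / SAMPLE_RATE_HZ))
    return spans


bursts = find_bursts(synth_trace())
for begin, end in bursts:
    print(f"vibration burst from {begin:.1f}s to {end:.1f}s")
```

A real capture would be far noisier, and a stronger speaker (such as the 7T’s in the experiment) would simply show up here as bursts with a higher RMS level.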
After that, they developed a MATLAB program to extract different features from the accelerometer data. Eventually, the researchers were able to identify time- and frequency-domain features. Further processing of the collected data with machine learning and neural networks allowed them to achieve accurate gender, identity, and speech recognition of the speaker.
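The researchers did this step in MATLAB, and the article does not list the exact features they used. As a rough illustration only, the following Python sketch computes a few common time-domain and frequency-domain features from one window of accelerometer samples. The feature names, the 200 Hz sampling rate, and the synthetic 60 Hz test signal are assumptions, not details from the paper.

```python
import cmath
import math


def time_features(signal):
    """Simple time-domain descriptors of one window of samples."""
    n = len(signal)
    crossings = sum(1 for a, b in zip(signal, signal[1:]) if (a < 0) != (b < 0))
    return {
        "mean": sum(signal) / n,
        "rms": math.sqrt(sum(x * x for x in signal) / n),
        "peak_to_peak": max(signal) - min(signal),
        "zero_crossing_rate": crossings / (n - 1),
    }


def magnitude_spectrum(signal, sample_rate_hz):
    """Naive O(n^2) DFT; returns (frequency_hz, magnitude) up to Nyquist."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))
        spectrum.append((k * sample_rate_hz / n, abs(coeff)))
    return spectrum


def frequency_features(signal, sample_rate_hz):
    """Dominant frequency and spectral centroid, ignoring the DC bin."""
    spectrum = magnitude_spectrum(signal, sample_rate_hz)[1:]
    total = sum(mag for _, mag in spectrum)
    dominant = max(spectrum, key=lambda pair: pair[1])[0]
    centroid = sum(freq * mag for freq, mag in spectrum) / total
    return {"dominant_hz": dominant, "spectral_centroid_hz": centroid}


SAMPLE_RATE_HZ = 200  # assumed sampling rate for the illustration
# Synthetic half-second window containing a 60 Hz vibration (assumption).
window = [0.2 * math.sin(2 * math.pi * 60.0 * i / SAMPLE_RATE_HZ)
          for i in range(100)]

features = {**time_features(window),
            **frequency_features(window, SAMPLE_RATE_HZ)}
for name, value in features.items():
    print(f"{name}: {value:.4f}")
```

A feature vector like this, computed per window, is the kind of input a classifier could then be trained on. In practice a library FFT would replace the naive DFT, which is used here only to keep the sketch dependency-free.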
The researchers have shared the entire setup and other technical information about EarSpy in a detailed research paper.
Limitations and Countermeasures
Despite its sophistication, the EarSpy attack has some inherent limitations.
First, the researchers explained that they could not achieve precise word recognition (they could only detect 45% to 80% of spoken words) due to the built-in volume reduction mechanism of ear speakers. Second, EarSpy’s success also depends on the distance between the ear speakers and the motion sensors in the handset, which varies significantly between devices. Third, the target user’s physical movements can also affect the motion sensors, which introduces noise into the captured measurements.
As for countermeasures, the researchers first suggest that smartphone manufacturers improve the permission model for motion sensors. Restricting permissions can prevent third-party apps from abusing their access to these sensors. Phone manufacturers should also consider designing their handsets in a way that prevents such attacks. For example, they can place the motion sensors at a distance where they pick up minimal influence from the ear speaker. Likewise, they should consider keeping the sound pressure from ear speakers relatively low during calls (as in old telephones).
This is not the first study to exploit the effect of speaker vibrations on motion sensors for eavesdropping. In 2019, researchers presented the “Spearphone” attack, which focused specifically on the effect of speaker vibrations on the accelerometer.