Abstract
This paper introduces mmWave-Whisper, a system that demonstrates the feasibility of full-corpus automated speech recognition (ASR) on phone calls eavesdropped remotely using off-the-shelf frequency-modulated continuous-wave (FMCW) millimeter-wave radars. Operating in the 77-81 GHz range, mmWave-Whisper captures earpiece vibrations from smartphones, converts them into audio, and processes the audio to produce speech transcriptions automatically. Unlike previous work, which focused on loudspeakers or a limited vocabulary, this work is the first to perform large-vocabulary, full-sentence speech recognition on earpiece vibrations from smartphones, expanding the potential for radar-audio eavesdropping. mmWave-Whisper addresses challenges such as the lack of large-scale training datasets, low signal-to-noise ratio (SNR), and limited frequency information in radar data through a systematic data pipeline that covers synthetic training data, domain adaptation, and inference, and that incorporates OpenAI's Whisper automatic speech recognition model. The system achieves a word accuracy rate of 44.74% and a character accuracy rate of 62.52% at distances of 25 cm to 125 cm. The paper highlights emerging misuse modalities of AI as the technology evolves rapidly.
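The abstract does not define the word and character accuracy rates it reports. As a rough illustration only, the sketch below assumes the common convention that accuracy equals one minus the edit-distance error rate (WER or CER) against a reference transcript. The function names, the lowercasing, and the whitespace tokenization are assumptions made here for illustration, not details taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Counts the minimum number of substitutions, insertions, and deletions
    needed to turn `ref` into `hyp`.
    """
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def accuracy(ref, hyp):
    """Assumed definition: 1 - (edit distance / reference length)."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return 1.0 - edit_distance(ref, hyp) / len(ref)


def word_accuracy(reference, hypothesis):
    # Assumption: case-insensitive comparison on whitespace-separated tokens.
    return accuracy(reference.lower().split(), hypothesis.lower().split())


def char_accuracy(reference, hypothesis):
    # Assumption: case-insensitive comparison on characters, spaces included.
    return accuracy(list(reference.lower()), list(hypothesis.lower()))


if __name__ == "__main__":
    ref = "call me back tomorrow"
    hyp = "call me bat tomorrow"
    print(f"word accuracy: {word_accuracy(ref, hyp):.2%}")
    print(f"char accuracy: {char_accuracy(ref, hyp):.2%}")
```

Under this definition a single wrong word costs a full word error but only a few character errors, which is one reason a character accuracy rate can sit well above the word accuracy rate for the same transcripts. The value can also drop below zero when the hypothesis contains many insertions.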
| Original language | English (US) |
|---|---|
| Journal | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings |
| State | Published - 2025 |
| Event | 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Hyderabad, India. Duration: Apr 6 2025 → Apr 11 2025 |
All Science Journal Classification (ASJC) codes
- Software
- Signal Processing
- Electrical and Electronic Engineering