WHO ARE YOU (I REALLY WANNA KNOW)? DETECTING AUDIO DEEPFAKES THROUGH VOCAL TRACT RECONSTRUCTION
Researchers develop a new mechanism for detecting audio deepfakes using techniques from the field of articulatory phonetics. They apply fluid dynamics to estimate the arrangement of the human vocal tract during speech generation. When parameterized to achieve 99.9% precision, our detection mechanism achieves a recall of 99.5% accuracy. We then discuss the limitations of this approach, and how deepfake models fail to reproduce all aspects of speech equally.
More Coverage on USENIX