Audio Engineering Theses


Non-negative matrix factorization (NMF) is an established method for audio source separation. Previous studies have combined NMF with supplementary systems to improve performance, but the perceptual effects of NMF parameters have received little investigation. The present study evaluated two NMF parameters for speech enhancement: the short-time Fourier transform (STFT) window duration and the divergence cost function. Two experiments were conducted. The first investigated the effect of STFT window duration on target speech intelligibility in a sentence keyword identification task. In the second, participants rated the residual noise level in target speech separated using three different cost functions: the Euclidean distance (EU), the Kullback-Leibler (KL) divergence, and the Itakura-Saito (IS) divergence. A 92.9 ms window duration produced the highest intelligibility scores, and the IS divergence produced significantly lower residual noise levels than the EU and KL divergences. Additionally, significant positive correlations were found between subjective residual noise scores and objective metrics from the Blind Source Separation Evaluation (BSS_Eval) and Perceptual Evaluation methods for Audio Source Separation (PEASS) toolboxes. The results suggest that longer window durations, with their increased frequency resolution, allow more accurate distinction between sources and thereby improve intelligibility scores, and that the IS divergence approximates high-frequency and transient components of audio more accurately, increasing the separation of speech and noise. The correlation results suggest that using full-bandwidth stimuli could increase the reliability of objective measures.
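For readers unfamiliar with the three cost functions named in the abstract, the following is a minimal sketch of how they are commonly defined as members of the beta-divergence family (beta = 2, 1, and 0). It is not the thesis's implementation: the function name `beta_divergence`, the factor of 1/2 on the Euclidean cost, and the toy spectrogram values are assumptions made for illustration. The example also checks one well-known property of the IS divergence, scale invariance, which means low-energy bins are weighted as heavily as high-energy ones; this may be related to the abstract's observation about high-frequency and transient components, though the abstract does not state that explicitly.

```python
import numpy as np

def beta_divergence(V, V_hat, beta, eps=1e-12):
    """Sum of element-wise divergences between a non-negative spectrogram V
    and its NMF approximation V_hat (hypothetical helper, for illustration).

    beta=2: squared Euclidean distance (with a conventional 1/2 factor)
    beta=1: generalized Kullback-Leibler divergence
    beta=0: Itakura-Saito divergence
    """
    # Floor the inputs so the logarithms and ratios stay finite.
    V = np.maximum(np.asarray(V, dtype=float), eps)
    V_hat = np.maximum(np.asarray(V_hat, dtype=float), eps)
    if beta == 2:
        return 0.5 * np.sum((V - V_hat) ** 2)
    if beta == 1:
        return np.sum(V * np.log(V / V_hat) - V + V_hat)
    if beta == 0:
        ratio = V / V_hat
        return np.sum(ratio - np.log(ratio) - 1.0)
    raise ValueError("beta must be 0 (IS), 1 (KL), or 2 (EU)")

# Toy data (assumed): one loud bin and one quiet bin, each over-estimated by 2x.
V = np.array([[1.0, 0.01]])
V_hat = np.array([[2.0, 0.02]])

for name, beta in [("EU", 2), ("KL", 1), ("IS", 0)]:
    print(name, beta_divergence(V, V_hat, beta))

# Scale invariance: multiplying both arguments by a constant leaves IS unchanged,
# whereas KL scales linearly and EU scales quadratically with that constant.
print(np.isclose(beta_divergence(10 * V, 10 * V_hat, 0),
                 beta_divergence(V, V_hat, 0)))
```

In this toy case the quiet bin contributes almost nothing to the EU cost and only a small share of the KL cost, but exactly half of the IS cost, because the IS divergence depends only on the ratio between the two spectrograms.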



First Advisor

Wesley A. Bulla

Second Advisor

Song Hui Chon

Third Advisor

Doyuen Ko

Fourth Advisor

Eric Tarr


Audio Engineering


Entertainment and Music Business, Mike Curb College of

Document Type



Master of Science in Audio Engineering (MSAE)

Degree Level


Degree Grantor

Belmont University


audio engineering; Fourier transform; matrix; audio source separation; coding; STFT; speech; psychoacoustics


