Jarina, Roman, Murphy, Noel, O'Connor, Noel E. (ORCID: 0000-0002-4033-9135) and Marlow, Seán
(2001)
Speech-music discrimination from MPEG-1 bitstream.
In: SSIP 2001 - WSES International Conference on Speech, Signal and Image Processing, 1-6 September 2001, Malta.
This paper describes a proposed algorithm for speech/music discrimination that works on data taken directly from the MPEG-encoded bitstream, thus avoiding the computationally demanding decoding-encoding process. The method is based on thresholding features derived from the modulation envelope of the frequency-limited audio signal. The discriminator is tested on more than 2 hours of audio data containing clean and noisy speech from several speakers and a variety of music content. The discriminator is able to work in real time and, despite its simplicity, the results are very promising.
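The general idea of thresholding a modulation-envelope feature can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the features, the frequency limit, or the threshold, so the function names, the use of the lowest four subbands, the depth measure (standard deviation divided by mean) and the threshold of 0.5 are all assumptions made for illustration. The input is assumed to be a list of frames, each holding subband magnitudes already read from the bitstream.

```python
import math

def band_envelope(subband_frames, max_band):
    """Per-frame energy of the lowest `max_band` subbands (assumed band limit)."""
    return [sum(v * v for v in frame[:max_band]) for frame in subband_frames]

def modulation_depth(envelope):
    """Illustrative feature: envelope standard deviation divided by its mean."""
    if not envelope:
        return 0.0
    mean = sum(envelope) / len(envelope)
    if mean == 0:
        return 0.0
    variance = sum((e - mean) ** 2 for e in envelope) / len(envelope)
    return math.sqrt(variance) / mean

def classify(subband_frames, max_band=4, threshold=0.5):
    """Label a segment by thresholding the feature (threshold is hypothetical)."""
    depth = modulation_depth(band_envelope(subband_frames, max_band))
    return "speech" if depth > threshold else "music"

# Synthetic data only: an on/off envelope versus a steady one.
speech_like = [[1.0] * 8 if (i // 5) % 2 == 0 else [0.0] * 8 for i in range(100)]
music_like = [[1.0] * 8 for _ in range(100)]
```

With these synthetic inputs, the strongly modulated segment falls above the threshold and the steady segment below it. Real audio would need features and thresholds tuned on labelled data.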
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Uncontrolled Keywords: audio; video; classification; speech; music; signal processing; MPEG