The growing amount of audio content in the world has created a greater need to analyze, summarize, and categorize it. Most of this content is polyphonic music, that is, mixtures of several audio sources, and the analysis of polyphonic music has attracted much recent interest. Analysis results can take the form of source tracking, where instrument pitch tracks and their weights are estimated from a sound mixture over time, or of source separation, where individual sources are extracted from the mixture. Both problems are addressed in this dissertation.

The main difficulty in analyzing audio mixtures is that the harmonic frequencies of different sources frequently overlap. Although audio sources are non-stationary, their spectra have a considerable amount of structure that can differentiate them from other sources. Non-negative matrix factorization (NMF) and probabilistic latent component analysis (PLCA) have recently been used by many researchers for the analysis of polyphonic audio. They provide good representations of audio mixtures as sums of individual sources.

To solve the multiple instrument tracking problem, a hierarchical probabilistic model is proposed that extends PLCA to estimate the basis spectra, their relative weights for each instrument, and the instrument pitches. A pitch-informed NMF-based method is proposed to resolve overlapping harmonics in source separation problems. Both methods were trained in advance on example spectra from similar instruments. Both methods were tested on standard datasets and were found to outperform several prior unsupervised state-of-the-art methods addressing similar problems.
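As a rough illustration of how NMF represents a mixture as a sum of sources, the sketch below factors a synthetic magnitude spectrogram into basis spectra and time-varying weights. It is plain, unsupervised NMF with the standard multiplicative updates for the generalized Kullback-Leibler divergence, not the hierarchical PLCA model or the pitch-informed NMF method proposed in the dissertation. The function names, the toy spectrogram, and all parameter values are assumptions made for this example.

```python
import numpy as np


def kl_divergence(V, V_hat, eps=1e-12):
    """Generalized KL divergence between two non-negative matrices."""
    return float(np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat))


def nmf_kl(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Factor V (frequency x time) as W @ H with multiplicative updates.

    W holds one basis spectrum per column; H holds the weight of each
    basis spectrum in each time frame. Both stay non-negative because
    every update multiplies by a non-negative ratio.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        R = V / (W @ H + eps)                            # ratio of data to model
        H *= (W.T @ R) / (W.sum(axis=0)[:, None] + eps)  # update weights
        R = V / (W @ H + eps)
        W *= (R @ H.T) / (H.sum(axis=1)[None, :] + eps)  # update basis spectra
    return W, H


# Toy mixture (hypothetical data): two harmonic "instruments" whose
# partials fall on multiples of 5 and 7 bins, so they collide at bin 35.
n_freq, n_frames = 64, 40
w_a = np.zeros(n_freq)
w_a[5::5] = 1.0
w_b = np.zeros(n_freq)
w_b[7::7] = 1.0
t = np.arange(n_frames)
h_a = (t < 25).astype(float)    # source A sounds in the early frames
h_b = (t >= 15).astype(float)   # source B enters later and overlaps A
V = np.outer(w_a, h_a) + np.outer(w_b, h_b)

W, H = nmf_kl(V, rank=2)
print("KL divergence after fitting:", kl_divergence(V, W @ H))
```

Each column of `W` can be read as a candidate source spectrum and the matching row of `H` as its weight over time. Nothing in this unsupervised version ties a component to a particular instrument or pitch, which is the kind of ambiguity that training on example spectra and using pitch information are meant to reduce.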
Methods for multiple pitch tracking and instrument separation from monaural polyphonic recordings