Thursday, 21 November 2013

Theme 3: Research and Theory Pre-reflection

Journal

The Journal of the Acoustical Society of America.
Impact Factor: 1.65

The Journal of the Acoustical Society of America (JASA) deals with sound and acoustics in a broad, interdisciplinary sense. It covers not only structural and architectural acoustics but also, among other things, musical acoustics, the psychology and physiology of hearing, and speech communication. JASA has been published since 1929.

Article

Li, Y., & Wang, D. (2007). Separation of singing voice from music accompaniment for monaural recordings. J. Acoust. Soc. Am. 122, 2989.


In this article Li and Wang propose a singing voice separation system. It is motivated by its application in automatic lyric recognition and singer identification systems, and it may also improve our understanding of the human hearing system. They focus on monaural recordings (mixtures) because this is a more general case than binaural recordings. Furthermore, Li and Wang show why singing needs its own separation system rather than one designed for speech. Singing often adds an additional formant, has a wider pitch range, and its pitch is piecewise constant. In addition, singing is generally accompanied by a harmonic, broadband interference that is correlated with the singing signal.

The methods and algorithms that Li and Wang use are motivated by relevant previous research, both their own and that of others. The system consists of three sub-systems. First, a singing voice detection stage classifies portions of the mixture as vocal or non-vocal. Second, the predominant pitch is extracted. Third, the resulting pitch contours are used in a pitch-based separation stage. In conclusion, Li and Wang argue that their approach gives good separation of the singing voice from the mixture. They point out that the system could be improved primarily in the first stage, the singing voice detection, by using different features of the mixture.
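To make the three-stage structure concrete, here is a minimal Python sketch of such a pipeline. It is not Li and Wang's implementation: each stage is replaced by a deliberately simple stand-in (an energy threshold for vocal detection, an autocorrelation peak for pitch, and a projection onto harmonics of the pitch for separation). All function names, the sample rate, the frame length and the thresholds are my own assumptions, chosen only for illustration.

```python
import math

SAMPLE_RATE = 8000  # Hz, assumed for this toy example
FRAME_LEN = 400     # samples per frame (50 ms), assumed


def split_frames(signal, frame_len=FRAME_LEN):
    """Cut the signal into non-overlapping frames, dropping any remainder."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


def is_vocal(frame, threshold=0.01):
    """Stage 1 stand-in: call a frame 'vocal' if its mean energy exceeds a threshold."""
    energy = sum(x * x for x in frame) / len(frame)
    return energy > threshold


def estimate_pitch(frame, fmin=80.0, fmax=500.0, sample_rate=SAMPLE_RATE):
    """Stage 2 stand-in: pick the autocorrelation peak within a plausible lag range."""
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


def extract_harmonics(frame, pitch, n_harmonics=5, sample_rate=SAMPLE_RATE):
    """Stage 3 stand-in: keep only the sinusoids at multiples of the pitch."""
    n = len(frame)
    out = [0.0] * n
    for k in range(1, n_harmonics + 1):
        w = 2.0 * math.pi * k * pitch / sample_rate
        # Project the frame onto the cosine and sine at the k-th harmonic.
        a = 2.0 / n * sum(x * math.cos(w * i) for i, x in enumerate(frame))
        b = 2.0 / n * sum(x * math.sin(w * i) for i, x in enumerate(frame))
        for i in range(n):
            out[i] += a * math.cos(w * i) + b * math.sin(w * i)
    return out


def pitch_contour(signal):
    """Stages 1 and 2: one pitch per vocal frame, None for non-vocal frames."""
    return [estimate_pitch(f) if is_vocal(f) else None
            for f in split_frames(signal)]


def separate(signal):
    """Chain the three stages; non-vocal frames are returned as silence."""
    voice = []
    for frame in split_frames(signal):
        if is_vocal(frame):
            voice.extend(extract_harmonics(frame, estimate_pitch(frame)))
        else:
            voice.extend([0.0] * len(frame))
    return voice
```

Calling separate(mixture) on a list of samples returns an estimate of the voice of the same framed length. The point of the sketch is the data flow: the detection stage gates the other two, which fits Li and Wang's remark that improvements should primarily be sought in the first stage. A real system would use far more robust methods in every stage.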

I find Li and Wang's text comprehensible and instructive. It is a significant step forward in a relatively small field of research. Subsequent research refines these methods further.

Gregor & Sutton

1. Briefly explain to a first year student what theory is, and what theory is not.

In my view, a theory can be seen as a general framework. It provides the researcher with causal relationships, systematic reasons for processes and logical explanations. References, data, research subjects, hypotheses etc. can be part of the formation of a theory, but they are not theories in themselves.

2. Describe the major theory or theories that are used in your selected paper. Which theory type (see Table 2 in Gregor) can the theory or theories be characterized as?


The main theory of Li and Wang's article is based on an analysis of the nature of the voice and on their use of relevant methods. With this in mind they propose their system as a general framework. They provide the blueprint for constructing the system and show that it can achieve good separation of vocals from a monaural mixture signal.


So, I'd say that this has a little bit of everything from Gregor's taxonomy. My first thought was that it would be type V (design and action), as it mainly describes how to design a good vocal separation system. But since it is thoroughly based on analysis and examination, and also makes a prediction about the outcome of the system, it also encompasses the other theory types.

3. Which are the benefits and limitations of using the selected theory or theories?



I think Li and Wang have a strong theoretical base. It provides their research with relevant data and with methods for designing their framework. However, there seems to be a lack of good standardised evaluation methods within this field (i.e. Sound and Music Computing). A probable cause is that the field is multi-disciplinary, with researchers from very different backgrounds.
