 
  Across Acoustics
Lead Vocal Levels in Pop Music
Is there an ideal level for lead vocals compared to accompaniment in popular music? Researchers at University of Oldenburg investigated this question by analyzing the Billboard Hot 100 year-end list from 1946 to 2020 as well as Grammy award nominees from 1990 to 2020. In this interview, author Kai Siedenburg discusses what his group learned about an important aspect of music mixing and the impact the research may have.
Associated paper: Karsten Gerdes and Kai Siedenburg. "Lead-vocal level in recordings of popular music 1946–2020." JASA Express Letters 3, 043201 (2023). http://doi.org/10.1121/10.0017773
Read more from JASA Express Letters.
Learn more about Acoustical Society of America Publications
Music: Min 2019 by minwbu from Pixabay. https://pixabay.com/?utm_source=link-attribution&utm_medium=referral&utm_campaign=music&utm_content=1022
Kat Setzer 00:06
Welcome to Across Acoustics, the official podcast of the Acoustical Society of America's Publications Office. On this podcast, we will highlight research from our four publications. I'm your host, Kat Setzer, editorial associate for the ASA. Today we'll be talking about a very recent article, “Lead vocal level in recordings of popular music from 1946 to 2020," which is in the April 2023 issue of JASA Express Letters. I'm speaking with Kai Siedenburg of the University of Oldenburg, one of the authors on the article. Thanks for taking the time to chat with me today, Kai! How are you?
Kai Siedenburg 00:42
Good. Thanks for having us.
Kat Setzer 00:44
Yeah, very excited. So first, tell us a bit about your research background.
Kai Siedenburg 00:49
Yeah. So in the Music Perception and Processing Lab at the University of Oldenburg, we basically study how listeners make sense of musical sound. And we're especially interested in music psychoacoustics and how hearing-impaired listeners perceive music. To study these questions, we of course conduct acoustical analyses and do psychoacoustic experiments, we develop auditory perception models, and also design music processing algorithms. So that's sort of the realm that we're working in.
Kat Setzer 01:19
That's very nifty. In this article, you're talking about lead vocals in music specifically. Can you explain how lead vocals are usually treated in audio mixing? Or in the music we hear?
Kai Siedenburg 01:29
Yeah, so, I guess there are a few important ingredients, like leveling, adjusting the right level of the lead vocals. And this is the parameter that we look at specifically in our paper. But there's also, of course, dynamic range compression, EQing, reverberation, overdubbing (so the overlaying of several tracks); these are important ingredients. I'm not sure whether there's a standard recipe. I mean, after all, mixing is really an artistic practice to reach a particular signature sound. I guess producers and mixing engineers certainly make use of very diverse and creative approaches.
Kat Setzer 02:07
Yeah, totally. Mixing is actually part of the, you know, music creation process in a way.
Kai Siedenburg 02:12
Exactly.
Kat Setzer 02:13
So what is the lead-to-accompaniment ratio, and why is it important?
Kai Siedenburg 02:18
The lead-to-accompaniment ratio is actually a very simple measure, it's just the level difference between the lead vocals and the accompaniment in a musical mix. And we measure that in decibels, so in dB, and we use A weighting before the calculation of that measure to emphasize frequency regions where human hearing is more sensitive and where most of the vocal frequency content is located.
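Editor's note: the measure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes the vocals and accompaniment are already available as separate mono sample arrays of equal sample rate, and it applies A-weighting in the frequency domain using the standard analytic weighting curve. The function names are hypothetical.

```python
import numpy as np

def a_weighting_gain(freqs_hz):
    """Linear A-weighting gain (standard analytic curve), about 1.0 at 1 kHz."""
    f2 = np.asarray(freqs_hz, dtype=float) ** 2
    num = (12194.0 ** 2) * f2 ** 2
    den = ((f2 + 20.6 ** 2)
           * np.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
           * (f2 + 12194.0 ** 2))
    # The +2.0 dB offset normalizes the curve to 0 dB at 1 kHz.
    return 10 ** (2.0 / 20) * num / den

def a_weighted_level_db(signal, sample_rate):
    """A-weighted level in dB relative to an arbitrary reference."""
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    weighted = spectrum * a_weighting_gain(freqs)
    # Proportional to the mean-square value of the weighted signal;
    # the constant factor cancels when two levels are subtracted.
    power = np.sum(np.abs(weighted) ** 2) / len(signal) ** 2
    return 10 * np.log10(power + 1e-20)

def lead_to_accompaniment_ratio_db(vocals, accompaniment, sample_rate):
    """Level difference (dB) between lead vocals and accompaniment."""
    return (a_weighted_level_db(vocals, sample_rate)
            - a_weighted_level_db(accompaniment, sample_rate))
```

A positive result means the vocals sit above the accompaniment in level; a negative result, as reported later in the interview for metal, means the accompaniment is louder.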
Kat Setzer 02:46
Okay, so you talk about the "loudness war" in your article. What is the loudness war, and what kind of research was done in relation to the loudness war?
Kai Siedenburg 02:56
The loudness war is a really interesting development in the time period that's also covered by our paper. It refers to what has been seen as some sort of competition between music producers fighting for the attention of the audience, so to say. So the rationale was that louder records would sell better. And this actually led to some sort of arms race in terms of record level. So record producers then basically compressed the dynamic range of their records, and thus, they were able to increase the overall level. But this excessive use of compression led to records around the year 2000 that sounded really flat, lifeless, and really unnatural. So there were not the same acoustic amplitude fluctuations as in natural sounds. So for that, they were also very much criticized. And yeah, with regards to previous research, there was, for example, a 2019 study by Michael Hove showing that fluctuations in the bass frequencies went along with this development. So they were associated with this increase in overall level. Something we really don't know is whether vocal level changed over time, and this is essentially the starting point of our study.
Kat Setzer 04:18
Okay, got it. So basically, like the music was, it used to have more variations between like, the quiet and the loud, and now then it became less difference between the quiet and the loud so that you can make everything louder. Is that kind of the idea?
Kai Siedenburg 04:33
Exactly. Yeah. So I mean, multiple cues contribute to our perception of loudness. So if you play a piano, very softly, also, you will hear changes in timbre. That's why you can compress the amplitude levels and still hear differences also in terms of perceived loudness or the sense of dynamics, but still the sound in terms of its level is very, very high. And that's sort of the underlying phenomenon of this loudness war.
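Editor's note: the mechanism discussed above, compressing the dynamic range and then raising the overall level, can be illustrated with a toy example. This is a simplified sketch with a static, sample-by-sample compressor; real mastering compressors use attack and release times and other refinements. The threshold and ratio values are arbitrary assumptions chosen for illustration.

```python
import numpy as np

def compress(signal, threshold=0.3, ratio=4.0):
    """Static compressor: magnitudes above the threshold grow `ratio` times slower."""
    mag = np.abs(signal)
    squashed = np.where(mag > threshold,
                        threshold + (mag - threshold) / ratio,
                        mag)
    return np.sign(signal) * squashed

def rms_db(signal):
    """Root-mean-square level in dB relative to full scale (amplitude 1.0)."""
    return 20 * np.log10(np.sqrt(np.mean(np.square(signal))))

def compress_and_normalize(signal, threshold=0.3, ratio=4.0):
    """Compress, then apply make-up gain so the peak matches the original peak."""
    squashed = compress(signal, threshold, ratio)
    makeup_gain = np.max(np.abs(signal)) / np.max(np.abs(squashed))
    return squashed * makeup_gain

# A tone that swells from quiet to loud, standing in for a dynamic recording.
t = np.linspace(0, 1, 8000, endpoint=False)
original = np.sin(2 * np.pi * 50 * t) * np.linspace(0.1, 1.0, t.size)
louder = compress_and_normalize(original)

print(f"peak before: {np.max(np.abs(original)):.3f}, after: {np.max(np.abs(louder)):.3f}")
print(f"RMS before: {rms_db(original):.1f} dB, after: {rms_db(louder):.1f} dB")
```

The peak level is unchanged, but the average (RMS) level goes up because the quiet passages are lifted toward the loud ones, which is the trade-off between loudness and dynamics described in the interview.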
Kat Setzer 05:07
Okay, got it. So what's known with regards to lyric intelligibility and sound level of the lead vocals in a song?
Kai Siedenburg 05:15
So to my knowledge, there's actually not much work on lyric intelligibility in music. Previous research has shown that there is a relation between loudness and intelligibility of lyrics. But there hasn't really been any detailed measurement. And this is a big contrast to the field of speech intelligibility, where we know a lot about intelligibility. And we, for example, know that around -7 dB speech-to-noise ratio, speech understanding gets really difficult, so people with normal hearing will be able to understand less than every second word. And for music prior to our study, we didn't really have a baseline of the levels at which vocals are mixed. And that's where we started.
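Editor's note: to give a feel for what a speech-to-noise ratio of -7 dB means, the decibel value can be converted back to a plain ratio. The helper names below are hypothetical; the conversion itself is the standard definition of the decibel.

```python
def db_to_power_ratio(level_db):
    """Convert a level difference in dB to a ratio of powers."""
    return 10 ** (level_db / 10)

def db_to_amplitude_ratio(level_db):
    """Convert a level difference in dB to a ratio of amplitudes."""
    return 10 ** (level_db / 20)

snr_db = -7.0
print(f"power ratio: {db_to_power_ratio(snr_db):.3f}")        # about 0.200
print(f"amplitude ratio: {db_to_amplitude_ratio(snr_db):.3f}")  # about 0.447
```

At -7 dB, the speech carries roughly one fifth of the power of the noise, which helps explain why understanding becomes so difficult at that point.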
Kat Setzer 06:08
Okay, got it. So we didn't really know like, how much difference between, like, the background music and the vocals you need to be able to understand the lyrics, essentially,
Kai Siedenburg 06:19
Exactly.
Kat Setzer 06:20
Okay.
Kai Siedenburg 06:21
Also, we didn't really measure lyric intelligibility in our study. But still, we obtained a baseline for a representative set of songs, that probably is going to be valuable for future research.
Kat Setzer 06:37
Okay, so you weren't looking at intelligibility, like you said, but you did have a couple of the hypotheses that you present in this paper. So can you explain both of these?
Kai Siedenburg 06:45
Yeah, so there were two competing hypotheses that we tried to differentiate empirically. So the first one would say that the lead-to-accompaniment ratio stays more or less fixed in music production, regardless of the time period that you're in or the genre that you're looking at. So this would mean that to guarantee intelligibility of the lyrics on the one hand, and the audibility of the accompaniment on the other, there always needs to be a specific value of this lead-to-accompaniment ratio. So this would be the one hypothesis and the alternative would be that the lead-to-accompaniment ratio undergoes changes over time that reflect stylistic and aesthetic trends. So this would mean that it's an aspect of mixing that is really subject to aesthetic development.
Kat Setzer 07:35
Okay. So what did you actually do to test these hypotheses?
Kai Siedenburg 07:39
Yeah, it was a pretty straightforward approach, we measured our lead-to-accompaniment ratio across time and across different genres. We essentially took a little bit more than 700 songs, used commercially available source separation software to separate the lead vocals from the accompaniment. And we looked at two different sets of songs: the first one being the Billboard Hot 100 Year End list, and the second one being Grammy Award nominations from five different genres. So we looked at country, rap, pop, rock, and metal.
Kat Setzer 08:18
Sounds very thorough. So your first experiment, which was regarding validating your source separation based estimate of the lead-to-accompaniment ratio, what did you end up finding in there?
Kai Siedenburg 08:29
So in our first experiment, we actually needed, as you say, to validate our approach to be sure that using the source separation software we get valid results. And what we found was that we get an error of around 1.5 to 1.6 decibels for songs with backing vocals and an error below one decibel for songs without backing vocals. And we figured this error was relatively small compared to the range of the lead-to-accompaniment ratios that we observed in our database, which was around 15 dB. And thus, we were convinced that it would be a valid way to go ahead with this approach.
Kat Setzer 09:15
Okay, so then what did you end up finding with regards to the lead-to-accompaniment ratio over the years?
Kai Siedenburg 09:21
Yeah, we looked at the lead-to-accompaniment ratio in the Billboard 100 charts. And we found that starting in the year 1946, this measure decreased up to the year 1975. And from there on, it more or less stayed constant. So there's this turning point in this aspect of mixing around the mid-1970s, where things started to stabilize.
Kat Setzer 09:49
Why do you think that happened?
Kai Siedenburg 09:50
Yeah, that's an interesting question. I mean, there are various factors playing into it. Of course, music technology changed drastically during that time. You have electric guitars coming up, you have multitrack recording coming up. An important aspect is stereophonic mixing. In psychoacoustic terms, this allows for spatial release from masking. So you can position multiple musical sources in a mix at different spatial locations, and then still listen to the vocals, for example, in the center, while other sources are panned towards the side, for example. And then due to the spatial separation, it seems plausible that the other sources can have higher levels compared to the vocals, and you're still able to hear the vocals, because you have the spatial release from masking.
Kat Setzer 10:45
Okay.
Kai Siedenburg 10:45
Of course, you also have musical factors playing into this. So musical genres, of course, constantly evolved during that time, right? So in the 1950s, you have more country style music represented in the Billboard Top 100, that's sort of light accompaniment, and then more heavy and louder electric bands in the 1970s, of course. So that's also something that plays into this.
Kat Setzer 11:16
Right, that all makes sense. So what did you learn with regards to lead-to-accompaniment ratio with regards to musical genre?
Kai Siedenburg 11:25
Yeah, so we looked at musical genre, then, from the 1990s onwards. And actually, we found quite strong differences. So we found that country had higher ratios compared to pop and rap. And then rock had lower ratios compared to rap, but still higher ratios compared to metal, so metal was even at negative lead-to-accompaniment ratios. That was one interesting thing, and of course, the role of guitars in the genre metal is an important factor there, because one could argue that the guitars sort of take over the prominent status of the vocals in metal, or the guitars are as important as the vocals. Another interesting finding was that solo artists had higher lead-to-accompaniment ratios compared to band-based artists. So it seems it really makes a difference whether there's a group of people in the mixing room sort of fighting for attention, or whether it's just the producer and the lead singer together.
Kat Setzer 12:32
Right, right. You don't want intraband drama.
Kai Siedenburg 12:35
Exactly.
Kat Setzer 12:36
So why are these findings regarding the lead-to-accompaniment ratio important in research?
Kai Siedenburg 12:41
Well, we find a way to acoustically describe music mixing and its development over time. And as I already said, we provide a baseline measurement of an important parameter in music production that will be valuable for future research in this domain. And more generally, I think our results demonstrate that mixing is not just a technical process, but fully intertwined with aesthetic intentions. And that it depends on genre. And it depends on whether records are made for solo or band artists. I think these are the major implications.
Kat Setzer 13:16
Yeah, do you have any more research planned with regards to popular music?
Kai Siedenburg 13:19
We actually have quite a few research projects working with popular music these days. So it's our standard model these days. We have one project looking into vocals in popular music, and trying to figure out the acoustical basis for why they are able to attract auditory attention. And this is regardless of level. So regardless of level relationships, somehow vocals in popular music capture the attention of listeners. And specifically, we're looking at the role of frequency micromodulation there. So these very fine-grained pitch changes you find in vocal sounds that you normally don't find in instrumental sounds. We're also working on a new test of musical scene analysis abilities, where we use mixtures of popular music. This essentially looks at the question, how good are people at hearing out individual instruments from mixes of popular music? And what's an efficient way to measure this? So we want to eventually use this to characterize how well hearing-impaired listeners perceive music. And we also have a study that looks into remixing popular music for hearing-impaired listeners.
Kat Setzer 14:34
Yeah, that sounds really cool. Sounds like you could eventually improve how hearing-impaired listeners hear music and thus get more enjoyment out of it, ideally.
Kai Siedenburg 14:42
That's one of our long term goals. Exactly.
Kat Setzer 14:45
So one question that our editor in chief had for you that is outside of the popular music realm was are there any studies on classical music and orchestra solos and concertos?
Kai Siedenburg 14:55
It's a great question. With classical music, it's actually much harder to get clean source material, because if you tell classical musicians to record a Beethoven symphony in the anechoic chamber, they will tell you that this is really a difficult job to do. I mean, there are very few examples of this out there, but it's really a difficult terrain. So to get at this, we're collaborating as part of the ACTOR project, which stands for Analysis, Creation, and Teaching of Orchestration, which is a multinational project based in Montreal. We're collaborating with Félix Baril, who is the founder of the OrchPlay software. And this helps us to generate clean source material. And we've actually just set up the Tristan Prelude by Richard Wagner in the lab, so using spatial acoustics, and are about to start perceptual experiments on how well we are able to identify musical instruments in mixes of classical music. So yeah, everybody interested in classical music, please stay tuned.
Kat Setzer 16:05
That's awesome. A little bit of research for everybody out there.
Kai Siedenburg 16:09
Yeah.
Kat Setzer 16:09
Well, thank you again for taking the time to speak with me today. It's funny because I listen to pop music all the time, and I probably have listened to a lot of the songs that you've studied, but it never occurred to me how songs might be mixed so consistently across genres or in terms of vocals versus instruments, or inconsistently, you know. I bet our listeners will enjoy learning about it as much as I did, and I wish you good luck in your future research.
Kai Siedenburg 16:29
Thanks, Kat, it was a pleasure.
Kat Setzer 16:32
Thank you for tuning into Across Acoustics. If you'd like to hear more interviews from our authors about their research, please subscribe and find us on your preferred podcast platform.