My Take on Music Recording with Doug Fearn

Fletcher, Munson, and You

Doug Fearn Season 1 Episode 93


Our perception of frequency balance varies with loudness, a fact documented by Harvey Fletcher and Wilden A. Munson, two scientists at Bell Laboratories. In 1933, they published a paper called, “Loudness, its definition, measurement and calculation.” It was groundbreaking science in the field of human hearing, and has implications today for how we perceive music. This is especially important when we are mixing.

In this episode, I explain the basics of the Fletcher-Munson Curves and how we can use that knowledge to make better recordings.

email: dwfearn@dwfearn.com
www.youtube.com/c/DWFearn
https://dwfearn.com/

Episode 93: Fletcher, Munson, and You (July 8, 2024)

I’m Doug Fearn and this is My Take on Music Recording.

No, there wasn’t a guy named Fletcher Munson. It was Harvey Fletcher and Wilden A. Munson, two scientists at Bell Laboratories. In 1933, they published a paper called “Loudness, Its Definition, Measurement and Calculation” in the Journal of the Acoustical Society of America.

It was typical of the pioneering work being done at Bell Labs. The Bell Telephone Company wanted to know everything they could about sound and audio, even if there was no practical application for it at the time -- such as digital audio, where the research would have to wait a couple of decades for the technology to make it practical for the telephone system.

Fletcher and Munson’s work included a series of “equal loudness curves,” which require a bit of interpretation to understand. This was important information for the telephone company, for regular telephone calls and for the ever-growing needs of broadcast radio networks, which used leased telephone lines to distribute their programming.

It was known that our hearing is not equally sensitive to all audio frequencies. Evolution has optimized us for listening to human speech. That makes sense, because communication is a vital requirement for a human society. If you could not communicate with others, you were at a disadvantage in life, and your chances of passing along your genes were diminished. Over a million years or more, our hearing became increasingly better at picking out a voice at a distance, or in a noisy environment.

That’s why our hearing is most sensitive around 3kHz. That’s where the most vital information in speech is concentrated. And that’s why communications circuits, like the old landline telephone system and modern cell phones, cut off the frequencies above about 3kHz. There is little useful information above 3kHz. It’s just wasted bandwidth when it comes to talking with someone.

Of course, speech bandwidth limited to 3kHz sounds pretty dull and lifeless to us. But we can understand the words, and are able to recognize the voices we know.

The same applies to the frequencies below 300Hz. The energy below 300Hz is not needed to understand someone talking. When you cut off the frequencies below 300Hz, you can still perfectly understand the words and recognize the voice. But it does lose a lot of the fullness of a full-bandwidth voice. Keeping the bandwidth to the minimum needed makes our telephone system work, whether it is analog or digital.

 

When Fletcher and Munson did their research, they realized just how “un-flat” our hearing actually is. We need a whole lot more sound pressure level, or SPL, at the low end of our hearing range, and at the high end, too.

Basically, at normal conversation levels, you need to increase the level at 20Hz by more than 20dB to sound as loud as a sound at 3kHz. That’s more than 100 times the acoustic power.

Similarly, at 20kHz, the sound has to be about 12dB louder than at 3kHz to sound like it is the same loudness.

Or, put another way, at typical speech levels, a sound at 20Hz needs more than 100 times the power of a sound at 3kHz to seem equally loud. Big difference.

And that is why we use decibels to quantify sound levels. dB is a logarithmic way to measure sound. A 10dB change in level is a 10 times difference in sound power. A 20dB difference is a 100 times difference. Because our ears respond to sound levels logarithmically, we don’t hear it as 100 times louder. Or softer. That’s why audio level meters are logarithmic. They display the audio level approximately as we perceive it.

But we all know that 20dB is a huge change in level.
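
As a rough illustration of the decibel arithmetic, here is a minimal Python sketch. The function names are made up for this example. The power and pressure formulas are the standard definitions of the decibel. The "twice as loud per 10dB" figure is only a common rule of thumb for mid-range sounds at moderate levels, and it is an assumption here, not something taken from the 1933 paper.

```python
def power_ratio(db):
    """Ratio of sound power (intensity) for a level change of `db` decibels."""
    return 10 ** (db / 10)


def pressure_ratio(db):
    """Ratio of sound pressure for a level change of `db` decibels."""
    return 10 ** (db / 20)


def approx_loudness_ratio(db):
    """Rule-of-thumb perceived loudness ratio: about 2x per 10 dB (assumption)."""
    return 2 ** (db / 10)


for change_db in (10, 20, 30):
    print(
        f"{change_db} dB: power x{power_ratio(change_db):.0f}, "
        f"pressure x{pressure_ratio(change_db):.1f}, "
        f"perceived roughly x{approx_loudness_ratio(change_db):.0f}"
    )
```

Under that rule of thumb, a 20dB change is 100 times the power but is heard as only about four times as loud, which is why a logarithmic scale matches our perception better than a linear one.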

 

Added to that non-linearity is another interesting quirk of our hearing. Our perception of loudness across the hearing range varies with the absolute level of the sound.

As sound levels increase, our hearing becomes closer to flat response. At 100dB SPL (very loud), our low-frequency response becomes almost perfectly flat. And it only takes about a 6dB boost at 20kHz to sound as loud as things in the mid-range sound to us.

We know this intuitively. You can’t hear any bass frequencies at all at very low listening levels.

 

If you look at the family of curves that Fletcher and Munson developed, you will see that they are strange. At first glance, they seem to imply our hearing has greater sensitivity to low and high frequencies, but you have to remember that what the graph is showing us is not frequency response, but how we perceive sounds to be equally loud across the audible frequency range.

But even if you inverted the graph to show our hearing frequency response, it wouldn’t look much like typical equalizer curves. The steepness of the curve changes with the absolute loudness, of course. As an approximation, the mid-loudness curve has a slope of roughly 6dB per octave at the lower bass frequencies, while the high-frequency curve is much more of a non-symmetrical peaking curve. That could explain why we often prefer equalization curves that are not the mathematically perfect curves an electronics engineer might want to see.
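
To give a feel for what a 6dB per octave slope means, here is a small Python sketch of a textbook first-order (single-pole) high-pass response. It is not a model of human hearing. The corner frequency and the function name are hypothetical, chosen only to show that each halving of frequency well below the corner costs about 6dB.

```python
import math


def first_order_rolloff_db(freq_hz, corner_hz):
    """Level in dB of a first-order high-pass response at freq_hz."""
    ratio = freq_hz / corner_hz
    return 20 * math.log10(ratio / math.sqrt(1 + ratio ** 2))


CORNER_HZ = 1000.0  # hypothetical corner frequency, for illustration only

for freq in (20, 40, 80, 160):
    level = first_order_rolloff_db(freq, CORNER_HZ)
    print(f"{freq:>4} Hz: {level:6.1f} dB")

# Difference in level between 40 Hz and 20 Hz, which are one octave apart.
drop_per_octave = (
    first_order_rolloff_db(40, CORNER_HZ) - first_order_rolloff_db(20, CORNER_HZ)
)
print(f"Change over one octave: {drop_per_octave:.2f} dB")
```

A standard shelving or peaking equalizer is built from smooth, symmetrical shapes like this one, which is part of why it cannot exactly mirror the lopsided curves of our hearing.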

 

So, what does this mean? Well, not all that much as a practical consideration for most humans.

But it does have interesting implications for those of us who work in the world of audio recording.

It means the EQ you add to a track when you listen to it at a typical average listener level of around 70dB SPL will sound totally different when you crank the monitors up to 100dB SPL. Suddenly there is a lot more bass, and even the highest frequencies seem boosted compared to how they sound at a lower monitoring level.

Which is correct? Well, they both are. It’s just a difference we have to live with because that’s how our hearing works.

But it does caution us on the use of equalization. The amount you may want to add will depend on how loud you are listening.

Most people in pro audio learn this pretty quickly, probably without knowing anything about Fletcher and Munson. It seems natural to us as we gain experience hearing our mixes in a wide variety of listening situations.

I can usually tell if a mix was made at ear-splitting level. It will tend to sound bass-deficient at a moderate listening level. Fletcher and Munson strike again.

 

Early in my career, I worked as an engineer at a major-market radio station. The engineer was responsible for keeping the levels consistent. There was a limiter that fed the transmitter, of course, but the station sounded best when the limiter was fed an optimum level.

To help make sure the levels were consistent, all the control rooms had fixed-set monitor levels. You could turn it down if needed, but the level control was stepped, had a limited range, and a large white arrow showing the standard level setting.

That made it easy to maintain consistent levels.

I do basically the same thing in my control room. I have a stepped level control with 2dB steps. Although it is adjustable over a wide range, there are certain level steps I try to use consistently.

One is my standard mixing level, which is about 75dB SPL. Another set-point is very low, about 25dB lower. That is useful to hear how the mix will sound as background music. And I have another setting that is around 80dB SPL, which is as loud as I ever want to listen.
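
As a sketch of how those set-points line up with a stepped control, the Python below finds the two 2dB detents that bracket a target level. It assumes, purely for illustration, that one detent is calibrated to exactly 75dB SPL at the listening position. The function and constant names are hypothetical.

```python
import math

STEP_DB = 2.0        # step size of the monitor control, from the text
STANDARD_SPL = 75.0  # standard mixing level; assumed to sit exactly on a detent


def nearest_steps(target_spl, reference_spl=STANDARD_SPL, step_db=STEP_DB):
    """Return the two stepped levels (dB SPL) that bracket target_spl."""
    offset_steps = (target_spl - reference_spl) / step_db
    below = reference_spl + math.floor(offset_steps) * step_db
    above = reference_spl + math.ceil(offset_steps) * step_db
    return below, above


# Targets taken from the text: background level (about 25 dB down) and maximum.
for label, target in (("background", 50.0), ("standard", 75.0), ("loud", 80.0)):
    low, high = nearest_steps(target)
    print(f"{label:>10}: target {target:.0f} dB SPL, detents {low:.0f} / {high:.0f}")
```

With 2dB steps, "about 25dB lower" lands on a detent 24 or 26dB down, and "around 80dB" lands on 79 or 81. What matters is returning to the same detent every time, not the exact number.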

Those preset levels are great for mixing. However, when tracking, those settings are not as useful and I set the monitor level to whatever is necessary to hear what I need to hear. Still at a moderate monitor level, however. I may crank it up briefly on occasion, just to check to make sure there is no noise that I wasn’t hearing at a normal level.

If you always mix at the same SPL, your mixes should be very consistent in loudness. It is a great starting point. Of course, I check it with a software loudness meter as soon as the mix is coming together. What loudness level to use is somewhat personal, somewhat driven by the type of music, and somewhat a consideration for what a streaming service will do to your levels.

Loudness is a topic for another episode. But in this discussion of Fletcher-Munson curves, it illustrates how you can benefit from being consistent with your monitor levels when mixing.

If you boost the bass to sound good at a low listening level, it will seem much too bass-heavy at a loud level.

Like most things in audio, everything is full of compromises. The optimum approach is to make the frequency content best for the widest range of listeners and environments.

Which introduces the question of whether we should apply equalization at all.

Our electronic equipment is always remarkably flat in frequency response. But the transducers – the mics and speakers – are not. In microphones, we appreciate the non-flat response of many mics for their warm sound, or for their sparkly high-end. Or both. They are not flat, but we chose that mic for how it works in the overall recording.

And an even bigger contributor to the deviation from flat frequency response is the room. That includes the studio and control room, of course, but also any isolation booths. We know how poorly designed rooms can add their own bizarre “equalization” to the sound. Usually, this is not a benefit.

I find that over the 50+ years I have been recording, I tend to use less and less equalization. I pick mics that offer the tonal quality I want. I avoid boosting bass and high frequencies. That is somewhat a reflection of the type of music I record, but I think it is wise overall.

Using minimal equalization also translates better when Fletcher and Munson come into play for the listener.

 

Most consumer speakers are not very flat. Even the best monitor speakers usually have at least plus or minus 5dB of variation across their frequency range. Home systems can be far worse. And keep in mind that those great-looking graphs of speaker frequency response were made in a way where there was no influence of a real room. The space where the speakers are located will always degrade the flat response.

And I might add parenthetically that I do not like the sound of a monitor system that has been equalized to compensate for the effect of the room. Those systems never sound right to me. I have tried for 50 years to convince myself that equalizing the system for the room was a good idea, but I don’t believe it. And a lot of top engineers I have encountered over the years agree. The better approach is to make the room as good as possible and forget the equalization.

 

And what about the room where someone is listening? That’s an entirely new set of variables. Many rooms will have plus or minus 20dB of variation in frequency response, depending on where you are in the room. Some frequencies will sound way out of proportion, either too loud or too soft.

Often the consumer has tone controls to play with that can dramatically change your careful recording. We can’t control that, and it’s not our job to do so. Every listener will have their own preferences. My observation is that most music consumers want their music to have way more bass content than what we think is right.

Perhaps that is why live sound always sounds like it has at least 15dB more low-end than what I would consider flat response. That’s what an audience wants.

 

Some consumer systems have a button labeled “loudness,” which imposes a sort-of simulation of an inverse Fletcher-Munson curve on the music. That is meant to compensate for the loss of highs and lows at low listening levels. However, it has been my experience that a lot of consumers use that feature all the time, no matter how loudly they are listening.

 

As an equipment designer, I strive to make the frequency response of my products as flat as possible. It is the goal of every designer, whether they are making electronic equipment or a loudspeaker. And yet our hearing has frequency response that is orders of magnitude worse than the equipment we use. How can it still sound good to us?

If we had a piece of gear that was down 20dB at 20Hz, and had a peak in frequency response around 3kHz, we would throw it away. We want our equipment to have flat response. That’s because flat response equipment maintains the original sound of the music, even though our hearing is far from flat.

We have a huge equalizer imposed on our hearing, but since we all hear basically the same way, it is what our brain considers flat response. If the equipment is perfectly flat, it maintains the proper tonal balance as we perceive it.

 

Another related phenomenon is that our perception of pitch changes with the sound level. I first realized this back in the 1970s when almost all pop records faded out at the end. No one wrote an ending to a song. You just repeated the chorus over and over and after a logical amount of time, you slowly dropped the level down to silence. That isn’t done much anymore, which I think is a good thing. A fade-out is sort of an admission that you couldn’t find a way to end the song.

But the fade-out illustrated to me another quirk of our hearing. During a mix, at a fairly high level, the music would go decidedly flat as the song faded out. At first, I thought something was wrong with the tape machine, but it wasn’t that. Our sense of pitch is dependent on level.

Once you hear that, you will notice it even at low monitoring levels. And it is not just something you hear on a song that fades out. More subtle, but still there, is an increasing sense of the pitch going flat as the last notes ring out at the end of a song.

Some of that could be the overtone content of a stringed instrument, which changes as the notes decay to silence. Both are real phenomena.

Fortunately, most people never notice this, or their sense of pitch is such that they can’t hear it. But for those of us who work with music daily, it’s another one of those unavoidable imperfections in our hearing.

 

So, what does this mean for you? Probably nothing very important. You already know about the weird frequency response with listening level, whether you are consciously aware of it or not. Our hearing has always had this strange quirk, so if we are paying attention, we know how it changes the perception of listening to music.

But I believe that the better we understand our hearing, the better we can make decisions that affect the final product.

However, if you are disappointed when you hear your work played in a different listening situation, perhaps a better understanding of how our hearing works will help you do a better job.

 

By the way, the curves that Fletcher and Munson came up with were based on averaging out hundreds of test subjects. Everyone is different, so no one curve defines how each of us hears things in the frequency realm.

Over the decades, other researchers have tried to refine the original 1933 curves that Fletcher and Munson developed. Interestingly, even as more test subjects were included, and a more geographically-diverse set of people were used, the original curves that those guys determined in 1933 have changed very little with a more sophisticated approach to testing. That’s good science, and typical of the work that Bell Labs did for decades before its demise in the 1980s.

 

Thanks for the continuing feedback. I appreciate it, and it helps me determine what people want to hear about.

I can be reached at dwfearn@dwfearn.com


This is My Take on Music Recording. I’m Doug Fearn. See you next time.
