My Take on Music Recording with Doug Fearn

DSD Digital Recording

Doug Fearn Season 1 Episode 105

Virtually all digital recording uses a format called PCM. But there is another digital format that works in an entirely different way. It’s called DSD, for Direct Stream Digital, and you might find that it sounds better than PCM.

In this episode, I explain what DSD is, mostly from a practical, user, viewpoint.

If it sounds better, why don’t we use it? Well, DSD comes with some serious limitations. I explain those limitations and the techniques used to get around them in the world of contemporary recording.

Most people will find the shortcomings of DSD to be enough of a problem that they have no interest in it. But for those of us who are on a quest for the best sounding audio we can achieve, DSD is worth it despite its challenges.

Here is a link to Native DSD, a digital distributor that has specialized in DSD digital downloads:

https://www.nativedsd.com/

… and a link to Outer Marker Records, the label the Hazelrigg Brothers and I founded a few years ago. All the releases on Outer Marker were recorded in DSD:

https://www.nativedsd.com/label/outer-marker-records/

We think DSD is the most musical recording format there is.

 

email: dwfearn@dwfearn.com
www.youtube.com/c/DWFearn
https://dwfearn.com/

Episode 105    DSD Digital Recording    May 31, 2025

 I’m Doug Fearn and this is My Take on Music Recording


 I remember the first time I listened to a CD. It was 1982. A friend who was director of engineering for a Philadelphia FM station brought a CD player over to my studio, along with some CDs and vinyl albums.

We first listened to the CD. I was amazed at how quiet the format was. It sounded pretty good.

Then we played the same recording from a vinyl disc. The difference was remarkable. The vinyl had significant surface noise and the stereo image was not as wide as the CD. But the impact of the sound? No contest. Vinyl won by a mile.

CD quality improved over the years, but it was never very good. Sixteen-bit recordings lack the detail of analog, or of higher bit-depth digital, so my reaction could have been predicted.

I had the same reaction with the first digital outboard gear in the late 1970s. Sure, it did amazing things, like delay, pitch change, and reverb. But that was all 16-bit digital and never sounded as good as analog.

Near the end of the 1980s, I thought DAT tape might be useful for some of the classical music recording I was doing, mostly of live performances. I was using my Studer 2-track tape machine for that, and at 15 inches per second, a 10-inch reel only held 30 minutes of music. It was a challenge to change reels during short breaks in the performance. A DAT cassette was good for an hour or more of recording.

I took a DAT machine to a location recording, using it in parallel with the analog tape. Back at my studio, I played them together and did an A/B listening test. Like the CD, it was a disappointment. The analog tape sounded better in every way except wow and flutter: the Studer was excellent in that respect, but the DAT recording had none at all. I used Dolby SR noise reduction with the Studer, which resulted in a noise level about the same as the DAT recording.

My initial experience with digital audio was disappointing. Analog tape still sounded better. And at my studio, I and my other engineers loved the convenience of digital reverb, but after a month or so, we all independently went back to our EMT plate reverb. It just sounded more natural and pleasant to listen to.

That was my early experience with digital audio. I tried to like it, but it was very much like my reaction to solid-state audio gear versus vacuum tubes. There was something that got on my nerves about solid-state and digital audio. Not exactly the same, but the analog and tube gear have always sounded more comfortable to me.

 

When it became possible to record at higher sample rates, and more importantly, with 24-bit resolution, I was happier with the sound. By the end of the 1990s, I made the shift from analog recording to digital and I was reasonably happy with the sound.

Digital offered a lot, especially in the realm of editing and digital manipulation. I liked that aspect. And converters continued to improve. I was pretty much done with analog recording, and I was OK with the sound of digital – as long as it was 96kHz sample rate or higher, at 24- or 32-bit resolution.

 

And yet, something still didn’t feel right. I found that to make music sound better, I had to use some equalization on the mix buss, generally a dip of about 2-4dB of broad eq between around 400 and 700Hz. That cleaned up the sound a lot. But of course, it also changed the impact of the mix. The diminished mid-range required a change in the balance, or even overdubs of instruments to try to restore that missing middle.

That puzzled me. I never had to do that with analog recording. Maybe my taste was changing, or the recording world had shifted to a new sound. Whatever the cause, I did what I had to. But I was never satisfied with the result.

There was something else: I found that long mixing sessions were very fatiguing. I found my attitude toward the music changed, too. I wanted the mixing to be over and done with. And I really didn’t want to listen to the release version ever again.

There were entire albums of music I really liked, well-performed, and, I thought, recorded pretty well. And yet often I would never listen to it again after my part of the process was done.

Something wasn’t right.

Better converters helped. My approach to recording in digital changed to make it sound better to me. I was working on some really good projects, but I was frustrated by what I was hearing and I wasn’t sure why.

 

It was Geoff and George Hazelrigg who introduced me to the DSD format. They recorded their jazz trio using an early DSD two-track recorder. Listening to their DSD master, I was amazed at how much their recordings sounded like the best analog tape, but without the hiss, wow and flutter, and minus the inherent graininess of tape.

That convinced me to give DSD a try. The two-channel Tascam DA-3000 recorder became available, which could record in DSD as well as conventional PCM. Soon the Hazelriggs and I had 4 of those units between us. For more complex projects, we would use them all, synced together, to give us 8 tracks.

It was also at that point that the major shortcomings of DSD became a challenge. I will get into that shortly.

 

First, some history. Years ago, I read about how Bell Labs developed the theory of digital audio back in the 1920s. Harry Nyquist was a Bell Labs scientist. His name has forever been associated with his theory of sampling rate in digital recording. Nyquist’s research also led to other breakthroughs in digital audio and information theory.

Bell Labs discovered or invented a huge portion of the things we know and use in audio. Their primary purpose was to support the telephone system. Along the way, Bell Labs invented things that are used universally in the world today, such as the transistor.

But they also did a lot of basic research, including schemes for converting analog telephone audio into digital and back to analog. Of course, the technology to actually make that work would have to wait another 25 years. But they had it figured out long before the telephone system implemented the first digital audio for long distance phone calls in the early 1950s.

The scientists at Bell Labs defined the two main schemes for digital audio – the DSD format (although it wasn’t called that at the time), and the PCM format, which is what is used for almost all digital audio today.

After they invented the transistor, digital audio became practical. They could have used either of the formats they invented, but they chose PCM because it was better suited to their application. Telephone audio only has to cover a band from 300 to 3000Hz, which they had determined years earlier was the minimum necessary to convey speech that was not only understandable, but also let the listener usually recognize the voice of the person at the other end.

PCM stands for Pulse Code Modulation, a term that isn’t too descriptive for us as users.

You probably know this, but PCM works by periodically sampling the audio and assigning a number that represents its level. A CD uses a sample rate of 44.1kHz, which means that 44,100 times a second, a snapshot of the audio is taken. The level is measured and assigned a numerical value. In the case of the CD, that is a number from 0 to about 65,000.

Those numbers are written to a hard drive or some other digital recording system as a binary number.

On playback, the numbers are read back and used to reconstruct the original audio.
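The sampling and numbering steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `pcm_encode` and `pcm_decode` are made up for this sketch, the 1kHz test tone is arbitrary, and the unsigned 0-to-65,535 range follows the description in this episode rather than the signed encoding an actual CD uses.

```python
import math

def pcm_encode(samples, bits=16):
    """Turn samples in the range -1.0..1.0 into whole numbers from 0 to 2**bits - 1."""
    top = 2 ** bits - 1
    codes = []
    for x in samples:
        x = max(-1.0, min(1.0, x))                  # clip anything outside the legal range
        codes.append(round((x + 1.0) / 2.0 * top))  # nearest available level
    return codes

def pcm_decode(codes, bits=16):
    """Turn the stored numbers back into samples in the range -1.0..1.0."""
    top = 2 ** bits - 1
    return [c / top * 2.0 - 1.0 for c in codes]

SAMPLE_RATE = 44_100   # CD sample rate: 44,100 snapshots per second
TONE_HZ = 1_000        # hypothetical test tone

# One hundredth of a second of a sine wave, standing in for the analog input.
analog = [math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE)
          for n in range(SAMPLE_RATE // 100)]

codes = pcm_encode(analog)
restored = pcm_decode(codes)

# The round trip is close but not exact: each sample lands on the nearest level.
worst_error = max(abs(a - b) for a, b in zip(analog, restored))
print(f"{len(codes)} samples, worst rounding error {worst_error:.7f}")
```

The rounding error is the part that never comes back on playback. More bits make the steps finer, so the error shrinks.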

65,000 discrete levels sounds like a lot of resolution, but our hearing is exquisitely sensitive. The 16-bit recording lacks fine details from the original analog, real-world sound. It sounds pretty good, but those low-level details are important to our enjoyment of the music. For an easy example, think of the decay of reverberation from a real acoustic space, or a reverb unit. If you want to hear the reverb down to its last audibility, the 65,000 range cuts off before the reverb is truly gone.

Is that important? Probably not a major loss, but apply that loss of low-level detail to all the sounds that make up music and you soon start to lose a sense of realism. Those last rattles of a snare drum, the subtle sounds made by the voice, or the low-level mechanical noises of most instruments are gone. That detracts from our enjoyment of the music. It doesn’t quite sound real anymore.

Increasing the bit depth to 24 brings a lot of improvement. 24-bit audio now has almost 17 million discrete levels that can be specified. That brings back the realism in those low-level details. 32-bit increases that to over 4 billion discrete levels. There is a point of diminishing returns, and I am pretty sure I could not consistently tell the difference between 24- and 32-bit resolution in a simple listening test. More bits can be helpful in other ways, however.
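The level counts quoted above are just powers of two, and the arithmetic is easy to check. The sketch below assumes plain integer samples (many systems store 32-bit audio as floating point, which behaves differently), and the helper names are invented for the illustration.

```python
import math

def pcm_levels(bits):
    """Number of discrete levels an integer sample of this size can specify."""
    return 2 ** bits

def smallest_step_below_full_scale_db(bits):
    """How far below full scale one single step sits, in dB."""
    return 20 * math.log10(pcm_levels(bits))

for bits in (16, 24, 32):
    print(f"{bits}-bit: {pcm_levels(bits):,} levels, "
          f"one step is about {smallest_step_below_full_scale_db(bits):.0f} dB below full scale")
```

Each added bit doubles the number of levels, which moves the smallest step roughly 6dB further down.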

 

Harry Nyquist and colleagues realized that the sample rate had to be higher than the highest frequency that you want to hear. The Nyquist theorem says the sample rate must be higher than twice the highest frequency.

Our hearing is generally considered to roll off rapidly above 20kHz, so theoretically a 40kHz sample rate would reproduce the highest frequency we can hear. But is that really true? At 20kHz, a 40kHz sample rate would define the waveform with only two points. That surely is not enough to accurately recreate a complex waveform such as found in music or speech. There are ways to manipulate the data to improve on this, but the bottom line is that you need a much higher sample rate to accurately capture the details of a complex waveform at the highest frequencies of the system. And those deficiencies extend to lower frequencies. After all, 10kHz is only an octave lower, and would be described by only four points at a 40kHz sample rate.
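The "points per cycle" figures above come from dividing the sample rate by the frequency of the tone. Here is that arithmetic as a small sketch. The function name is made up, and the 96kHz and 192kHz rows are added only for comparison with the rates mentioned elsewhere in this episode.

```python
def samples_per_cycle(sample_rate_hz, tone_hz):
    """How many samples land on one full cycle of a tone."""
    return sample_rate_hz / tone_hz

for sample_rate_hz in (40_000, 96_000, 192_000):
    for tone_hz in (10_000, 20_000):
        points = samples_per_cycle(sample_rate_hz, tone_hz)
        print(f"{sample_rate_hz} Hz sampling, {tone_hz} Hz tone: {points:.1f} points per cycle")
```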

 

And there is another serious problem with this PCM approach. If there is any audio above half the sample rate, the conversion process doesn’t know what to do with that information. That will generate audio artifacts that fall into the audible range. Those sounds weren’t in the original audio. They are just junk that has no relation to the music, which is heard at a lower frequency range. And those sounds are not harmonically-related to the original audio, so they sound bad.
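Where those artifacts land can be worked out with simple arithmetic. The sketch below assumes an ideal sampler with no filter in front of it at all, which is a worst case and not how a real converter behaves. The function name and the 30kHz example tone are hypothetical.

```python
import math

def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency at which an unfiltered tone shows up after sampling."""
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

SAMPLE_RATE = 44_100

# Suppose an instrument playing a 10 kHz note has a third harmonic at 30 kHz.
# That harmonic is above half the sample rate, so it cannot be stored as is.
harmonic_hz = 30_000
ghost_hz = alias_frequency(harmonic_hz, SAMPLE_RATE)
print(f"{harmonic_hz} Hz is captured as {ghost_hz} Hz")

# The stored samples really are indistinguishable from those of the lower tone.
high = [math.cos(2 * math.pi * harmonic_hz * n / SAMPLE_RATE) for n in range(100)]
low = [math.cos(2 * math.pi * ghost_hz * n / SAMPLE_RATE) for n in range(100)]
identical = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(high, low))
print("samples match:", identical)
```

In this example the 30kHz harmonic comes back at 14.1kHz, which is not a multiple of the 10kHz note. That is the "not harmonically related" part.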

To avoid this, a low-pass filter is used to abruptly cut out any frequencies above the Nyquist frequency. Problem solved. Or is it?

There are two main things that go wrong with this. First is that no filter is perfect, so it can’t be a “brick wall” where absolutely nothing above, say, 20kHz gets through. Sure, it’s highly attenuated, but then our hearing is highly sensitive to this stuff that shouldn’t be there.

And there are other problems with filtering that can affect the sound.

Good filter design tries to mitigate these problems, but it is still less than ideal. And it can be audible.

Higher sample rates minimize this problem. The filters cut off at a much higher frequency, so the higher sample rate requires less severe filters, which sound better.

 

The generation of these unwanted frequencies is called “aliasing.” An alias is a term used to describe an alternative name of someone, usually with an unlawful connotation. That describes the aliasing we hear in PCM digital recording as well. The audio has assumed a new name, one that has no relationship to the original sound.

A brilliant scientist used a common experience to explain aliasing. A movie camera, whether film or digital, takes a series of discrete photos at a particular frame rate. In film, that is 24 frames per second. It’s the same concept as the sample rate of PCM audio.

You have seen aliasing in video. The classic example is in a Western movie with a wagon with large, spoked wheels. It is difficult to not be distracted by the way the rotating spokes look when the wheel is moving. At slow speed, it looks fairly normal. But as the wheel speeds up, something very strange happens. The spokes suddenly stop moving smoothly and can seem to momentarily be frozen in one position. Then the wheel may appear to start rotating backwards.

I know you have noticed this. I remember being baffled by that oddity when I was a kid watching a western movie.

It’s less obvious, but still there, with almost any type of rotating wheel, as long as there is some texture to the wheel surface. It’s true with digital video, too, which usually uses 30 frames per second.

What we are seeing is the frequency of the spinning wheel spokes becoming too fast for the sample rate of the movie camera. It’s aliasing, and it is directly analogous to the aliasing artifacts we hear in PCM digital audio.

As you see when the wheel appears to be turning backwards, it is the inability of the movie system to accurately reproduce a frequency that is too high. Instead, we get artifacts at a lower frequency.
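The wagon wheel effect can be put into numbers. The sketch below assumes a wheel with identical, evenly spaced spokes, so the picture repeats every time the wheel turns by one spoke spacing. The 12-spoke wheel, the speeds, and the function name are made up for the illustration.

```python
def apparent_revs_per_second(true_revs_per_second, spokes=12, frames_per_second=24):
    """Rotation speed the viewer perceives; a negative result means backwards."""
    # How many spoke spacings the wheel really advances between two frames.
    advance = true_revs_per_second * spokes / frames_per_second
    # The eye picks the smallest movement that explains the new picture,
    # so wrap the advance into the range -0.5 .. 0.5 spoke spacings.
    wrapped = (advance + 0.5) % 1.0 - 0.5
    return wrapped * frames_per_second / spokes

for speed in (0.5, 1.75, 2.0):
    seen = apparent_revs_per_second(speed)
    print(f"wheel turning at {speed} rev/s looks like {seen:+.2f} rev/s")
```

At 0.5 revolutions per second the wheel looks right. At 2 it appears frozen, and at 1.75 it appears to creep backwards, even though it never changed direction.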

Lots of clever engineering goes into minimizing these aliasing artifacts, in audio and video, but they’re still there in many instances. And those low-level sounds get on your nerves, whether you consciously hear them or not.

 

I know some of this is controversial. Many people claim the artifacts are much too low in level for anyone to hear them. And here is where we get into the brain versus heart aspect of audio I have talked about in many episodes of this podcast.

You can strain your ears and brain as much as you like and you probably will not hear the aliasing artifacts. Maybe they are inaudible after all. But my subjective experience says otherwise. There is something annoying about PCM audio that reduces my tolerance for listening to it.

I think I am more sensitive to that kind of thing than most people. My life has been devoted to eliminating irritants, in recording, in the products I design, and in my daily life. Maybe it doesn’t affect the majority of people. Maybe I am fooling myself. But I don’t think so. I believe the effect is real.

 

And here is where DSD comes in. DSD stands for Direct Stream Digital, a term that might be meaningful to marketing people but doesn’t really describe the process in any meaningful way.

Another term for DSD is 1-bit recording, which actually tells you something useful about the approach.

 

I have talked to several DSD experts, and read a lot of explanations, and, frankly, I am still not sure I fully understand the process. However, there is one simple concept that I am fairly certain that I understand, and that is how DSD takes an entirely different approach to converting analog to digital.

Instead of sampling audio at a certain rate, like 96kHz, and encoding the sample level into a binary number, DSD simply looks at the sample, compares it to the one that came before, and determines if the new sample is higher or lower in level than the previous one.

If it’s higher, it gets assigned a value of 1. If it isn’t, it gets a value of 0.

That’s where the 1-bit term comes from. The only thing stored is that one bit, higher or lower. Nothing else.

In a lot of ways, it is a simpler process than PCM.

If we were to sample the audio at typical PCM sample rates, there wouldn’t be enough data to accurately reproduce the original waveform. We have to sample at a higher frequency. A lot higher.

The SACD format, Super Audio CD, came out in 1999 as an audiophile format that offered much better performance than a CD. It used a sample rate of 2,822,400 Hertz. That’s exactly 64 times the 44.1kHz sample rate of a CD. It never really caught on as a commercial distribution format, but it was an improvement over the CD.
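The higher-or-lower idea, and the reason it needs such a high sample rate, can be tried out with a toy model. This is only a sketch of the simplified scheme as described in this episode, not the design of any actual DSD converter, which is considerably more elaborate. One assumption is added here: since playback only has the bits to work with, the comparison is made against a running estimate that moves up or down by a fixed step for each bit. The function names, step size, and test tone are all hypothetical.

```python
import math

def one_bit_encode(samples, step):
    """Store a 1 when the input is above the running estimate, otherwise a 0."""
    estimate = 0.0
    bits = []
    for x in samples:
        bit = 1 if x > estimate else 0
        estimate += step if bit else -step   # follow the input by one step
        bits.append(bit)
    return bits

def one_bit_decode(bits, step):
    """Rebuild the waveform by stepping up for each 1 and down for each 0."""
    estimate = 0.0
    levels = []
    for bit in bits:
        estimate += step if bit else -step
        levels.append(estimate)
    return levels

def worst_tracking_error(sample_rate_hz, tone_hz=1_000, amplitude=0.5, step=0.002):
    """Largest gap between a test tone and its 1-bit reconstruction over one cycle."""
    count = sample_rate_hz // tone_hz        # roughly one cycle of the tone
    tone = [amplitude * math.sin(2 * math.pi * tone_hz * n / sample_rate_hz)
            for n in range(count)]
    rebuilt = one_bit_decode(one_bit_encode(tone, step), step)
    return max(abs(a - b) for a, b in zip(tone, rebuilt))

error_at_cd_rate = worst_tracking_error(44_100)
error_at_sacd_rate = worst_tracking_error(2_822_400)
print(f"worst error at 44.1 kHz:   {error_at_cd_rate:.3f}")
print(f"worst error at 2.8224 MHz: {error_at_sacd_rate:.3f}")
```

At 44.1kHz the one-step-at-a-time estimate cannot keep up with the tone and the reconstruction is useless. At 2.8224MHz it stays within one small step of the input the whole way.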

For professional recording, a higher sample rate is used, always in multiples of 44.1kHz. The next higher is 5.6448MHz, or 128 times the CD rate. That is called DSD128. That’s what the Tascam DA-3000 used for its implementation of DSD.

Merging Technologies in Switzerland introduced converters that did both conventional PCM conversion, up to 384kHz, and DSD up to 11.2MHz sample rate, called DSD256.
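The DSD rates are all whole multiples of the CD sample rate, so the numbers are easy to verify. The sketch below also works out how long a single sample lasts at each rate. The function name is made up for the illustration.

```python
CD_RATE_HZ = 44_100

def dsd_rate_hz(multiple):
    """Sample rate of a DSD flavor, given its multiple of the CD rate."""
    return multiple * CD_RATE_HZ

for multiple in (64, 128, 256):
    rate = dsd_rate_hz(multiple)
    sample_length_us = 1_000_000 / rate      # duration of one sample in microseconds
    print(f"DSD{multiple}: {rate:,} Hz, each sample lasts {sample_length_us:.3f} microseconds")
```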

 

It seems intuitively impossible that just storing whether the current sample is louder than the previous one or not could work. And it wouldn’t if the sample rate was too low. But in those MHz ranges, you cannot possibly detect that the audio is broken into chunks that last less than a microsecond.

That’s somewhat like how a Pulse Width Modulator compressor works. The principle is entirely different; PWM compression has nothing to do with digital audio. But chopping the audio in tiny segments in the MHz range makes the audio level manipulation entirely transparent.

 

So how does DSD recording sound? As I said, my first reaction was the lack of annoying artifacts. And maybe that is the complete answer. It certainly would be enough for me to prefer DSD.

But can you hear a difference? Superficially, maybe not for many of us. High-resolution PCM sounds excellent. DSD also sounds excellent. The difference lies, in my experience, with the elimination of the irritants that bother me in PCM.

You might say, “that’s great for people with golden ears, but is that enough to justify using the DSD format?”

I have always believed that the subtle irritants in recording diminish your enjoyment in ways you can’t quite put your finger on. And although most of us work with the best-sounding audio anyone in the world has an opportunity to hear, I can’t help but believe that even the casual listener can be annoyed by things in the sound that shouldn’t be there.

That could be distortion of any kind. Tape and vinyl have inherent distortion that is orders of magnitude higher than what the electronic equipment is capable of. And yet that distortion is not annoying. In fact, it often enhances the sound of the music when used judiciously. I’ve talked about that in previous episodes. The distortion we like fits in with the music and can actually enhance our enjoyment of it.

But if a recording system injects artifacts that are totally unrelated to the music, that’s a problem.

 

I realized early in my recording career that tape machines generated a lot of garbage that wasn’t in the original sound. If you put a pure sine wave signal generator into a tape machine, and sweep it from 20 to 20,000Hz, as the frequency goes up above a few kHz, you will start to hear strange whistles and tones that shouldn’t be there. The higher the frequency, the worse they become.

There are technical reasons why this happens, which I won’t bore you with here. But the pure tone from the signal generator gets corrupted in the tape-recording process. And since all sounds are fundamentally made up of sine waves, the same thing can happen to the music.

We don’t seem to mind those tape artifacts as much as we do digital artifacts, which are somewhat similar in effect. Perhaps it is the inherent noise and distortion of tape that largely masks those sounds.

 

If you do the same frequency sweep in PCM digital recording, the effect it has on us is different. The pure tone of the sinewave just doesn’t sound as pure anymore.

To hear that, you really need a high-quality, lab-grade, analog oscillator. Although pure sine waves are about the most boring sound you can imagine, they are useful to me because they reveal the defects in any recording gear. By the way, digitally-generated sine waves do not sound as pure as a true analog generated sine wave.

 

With DSD, the bottom line for the music consumer is that music lacks artifacts. It just feels better to them. They have no idea why. They might simply say, “I really like this song.” And they won’t get tired of listening to it.

But if listeners are exposed to this constant irritation with music they otherwise like, they will likely come to a point where they decide to just turn it off.

The same thing happens to us in the studio. We likely will hear the same song many more times during the course of its production than most listeners ever will.

Has it ever happened to you that you were working on a song that you really liked but quickly got sick of hearing it? It’s the same principle as the listener’s reaction. We are simply annoyed by the sound in ways we don’t understand.

 

Here’s a real-world example.

I found that when I started recording in DSD, my sessions seemed much more pleasant and relaxed. I could go all day and not get annoyed at the sound. In fact, I liked the way it sounded so much that I would often go home after a long session and listen to what we recorded, sometimes multiple times. I can never recall doing that with PCM digital sessions. I would do that occasionally back in the days of analog tape.

But with DSD, I almost always want to hear it over and over.

 

OK. Perhaps you are now intrigued enough to want to try DSD for yourself.

But I have to now tell you about the downsides of DSD. Some of these things are entirely counter to the way we are accustomed to working in PCM digital recording. Many people will conclude that the shortcomings of DSD are not worth the major change in workflow.

Let me explain.

You cannot manipulate a DSD recording in any useful way. The DSD format does not allow you to mix or modify the tracks in the native DSD format. A DSD recorder is much like a tape machine in that sense.

You cannot add eq, compression, delay, pitch correction, or any other effect in the DSD format.

In order to include those production tools we use often, you have some restrictive options. You can convert the DSD files to PCM for mixing, or do all the mixing and modifying in the analog realm, or add as much of the processing as possible with analog outboard gear while you are tracking.

Big problem. It’s very frustrating.

So, you are probably saying, if I have to convert my DSD tracks to PCM to finish the project, won’t I be just where I would be if I just did the project in PCM from the start?

That was my original conclusion. But what I found was that there was something almost magical about the DSD original capture that was to a large degree preserved through the subsequent PCM conversion. That does not make logical sense to me, but I know it is a real thing. A PCM mix derived from DSD capture still sounds better than starting out in PCM.

I am not alone in this observation. And I have found that there is an advantage to DSD capture all the way down to an MP3 or similar data-compressed format. Most listeners will hear your recording from some sort of data-compressed file. I am convinced it still sounds better than if the original recording was made in PCM digital.

Perhaps it is the pristine simplicity of the DSD capture that makes every subsequent format conversion sound better. I don’t know.

If your workflow requires plug-ins, pure DSD recording may not be for you. If you need automation for your mixes, pure DSD may not be for you.

Another factor to consider is that you can only record as many DSD tracks as you have converter channels. In my case, that is 16 channels. If I need more than 16 tracks, I am out of luck.

When I need more than 16 tracks, my only option is to convert the DSD tracks to PCM in a new session. Then I can add as many additional tracks as I need. But they are not new DSD tracks, of course. I make sure that all the featured instruments and voices are on those 16 DSD tracks.

Often, I do not need more than 16 tracks for the projects I do, so it’s not a major problem. Perhaps I will add another 8 tracks with an additional converter at some point.

 

What about mixing in analog? I have always tried to avoid putting the audio through any more electronics than is strictly necessary. Using a mixing console seems like a major step backwards if you are trying to keep the original audio quality as high as possible. Like most things, it is a tradeoff. You have to decide if that analog mixing is worth it or not.

I won’t go into my feelings about solid-state electronics, which I have talked about in many episodes and you are probably tired of hearing me say it.

Our record label, Outer Marker, records everything in DSD, no exceptions. For about half of our albums, the track count is so low that we can mix using an analog passive mixer we built for the purpose. It uses a VT-2 mic preamp as the summing amplifier. The line-level output of the VT-2 goes back to the DSD recorder as a new pair of DSD tracks.

If we have to add reverb, we record the reverb on new DSD tracks before mixing. If we need to manipulate the tracks with eq or compression, we do that as an analog insert either on individual tracks, or on the mix buss, using outboard gear.

The result is about as pure as the original tracks.

For more complex projects, I convert the original DSD tracks to PCM and mix in the digital realm, the same as most records are made today. It is a compromise, and I wish I did not have to do that, but the final product needs to fulfill my goals, and currently that is the only way to get what I want. I generally use 192kHz sample rate with a 32-bit depth. That is the highest resolution my system allows with full access to plug-ins and automation.

 

The projects I do generally do not require much editing or punching-in. Those two techniques can be done in the DSD format. It’s not much different than what we do in PCM. Re-assembling a composite track from multiple takes or from punched-in segments works fine if you are mixing in analog. But a conversion to PCM means finding all the original DSD files, converting each to PCM, and manually assembling and sync’ing them in the PCM mix. I find that more trouble than it is worth, unless there is no other option.

 

In order to make edits, it is necessary to use a DAW that can handle DSD editing. We use Merging Technologies’ Pyramix for DSD recording. For the duration of the edit transition, Pyramix converts DSD to a very high resolution PCM format. It then reconstructs the edit in DSD. The duration of the edit crossfade is generally milliseconds at most, and the edit is imperceptible.

 

I dream of the day when DSD is as easy to use as PCM recording and mixing. The challenges to make that possible are seriously daunting. I have talked to several experts who tell me that they could make it work, but it would require years and millions of dollars. That kind of commitment is unlikely for a format with so few users.

It is a catch-22. If there were more users, it would be practical to develop a system to do what we need. But we won’t get enough users until such features are available.

I know from talking to many recording engineers that they are attracted to using something better than PCM digital. The interest is there. The practical implementation is not there for most complex projects.

I put up with the frustrations and shortcomings because the improvement in the sound is worth it to me. Most people will come to a different conclusion. If you ever have an opportunity to hear DSD, I think you should give it a listen. You might find that you can hear the difference. You might not at first, but with repeated listening to the same piece of music, you may suddenly come to the realization that DSD does indeed sound better, has fewer irritants in the sound, and it just makes you smile.

 

You can hear quite a few examples of recordings done in DSD by going to Native DSD. The link is in the description. They have hundreds of albums available for download in DSD. You will need some sort of DSD player to hear them.

Outer Marker has albums in a range of genres. All are recorded in DSD. Some are simple enough that no mixing was required. They are pure two-track recordings. Others were mixed with our analog passive mixer. And some required the conversion to PCM for mixing.

I would be interested in hearing what you think.

 

Thanks for listening, subscribing, and commenting. You can reach me at dwfearn@dwfearn.com

 

This is My Take on Music Recording. I’m Doug Fearn. See you next time.
