
Pt.4 - Understanding Audio Interface Settings | Sample Rate, Bit Depth, Buffer Size, and Latency

By Austin Knaus

4. Understanding Audio Interface Settings

So we’ve discussed why we need an audio interface. But what exactly is it doing, and how does it affect audio quality? In this lesson we’ll cover a few important settings on the interface and the role they play in audio quality. By the end of this lesson you’ll be able to optimize your audio interface settings and understand the effects of changing them. Let’s dive in!

_________________________________________________________________

 

  • 4.1 Sample Rate

    • What sample rate is and how it affects sound quality.

Sample rate refers to the number of times per second that an analog audio signal is measured or “sampled” when being converted into digital form. It is measured in Hertz (Hz) or kilohertz (kHz), which represents the number of samples per second.

If your eyes just grew wide, don’t worry, we’ll dive deeper.

Sample rate, in simple terms, is how many times per second the interface takes a sample. We can think of a sample as a picture: just as a movie is really a series of fast-moving pictures, sampling is taking a rapid series of pictures of our audio. The higher the sample rate, the more pictures, or “samples”, are taken each second.

_________________________________________________________________

 

Why do we need to sample?

There are two different types of audio signals we will deal with. The first is Analog, the second is Digital. 

The human voice,

the wind,

an acoustic guitar,

the drums.


These are all Analog signals. Microphones are built to pick up analog signals, which then pass through to the audio interface, where they get converted to a Digital signal, or a signal made of binary 1s and 0s that computers understand.

 

The computer can only understand binary 1s and 0s, whereas an analog signal can take on an infinite number of values as it plays.

So instead of just 1 and 0, we capture values like 0.000001 all the way up to 1.000000. This process is called analog-to-digital conversion, or ADC.
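If it helps to see the idea in code, here is a tiny Python sketch of that conversion. The function name and the 16-bit depth are placeholders chosen for illustration; real converters do this in hardware, far more carefully.

def to_binary_sample(analog_value, bits=16):
    # Map an amplitude between 0.0 and 1.0 onto one of 2**bits integer steps.
    steps = 2 ** bits - 1                    # 16 bits -> 65,535 possible values
    return format(round(analog_value * steps), "b")

print(to_binary_sample(0.000001))            # so quiet it rounds all the way down to 0
print(to_binary_sample(0.555555))            # roughly half scale
print(to_binary_sample(1.000000))            # full scale -> sixteen 1s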

_________________________________________________________________

 

Here’s an example, and an oversimplification of the concept purely for understanding purposes.

If our sample rate is 5, we take 5 pictures a second. If we measured an analog signal that took one second to travel from its lowest point to its highest point, each picture would capture a higher number than the last.

First (0.000111) - Second (0.004444) - Third (0.555555) - Fourth (0.888777) - Fifth (1.000000)

If we increased the number of pictures, or “samples”, taken within that second, we would have more numbers between the lowest and highest point. The more samples we take, the more detail we collect.

Sample Rate of 5


Sample Rate of 22

 

Sample Rate of 44

 

Sample Rate of 88


Lucky for us, sample rates are measured in kilohertz (kilo = 1,000 / hertz = times per second), which means thousands of times per second. In fact, the standard baseline sample rate is 44.1 kHz. In other words, the analog signal gets its picture taken 44,100 times a second, and the audio interface translates those pictures into digital information that the computer understands.

Sample rate means samples per second. Samples are snapshots of the audio signal that get converted into binary.
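For the code-minded, here is a minimal Python sketch of sampling: it takes 44,100 “pictures” per second of a waveform, with a 440 Hz sine wave standing in for the analog signal (the tone and the one-second duration are just illustrative choices).

import numpy as np

sample_rate = 44_100                          # samples ("pictures") per second = 44.1 kHz
duration = 1.0                                # one second of audio

sample_times = np.arange(int(sample_rate * duration)) / sample_rate
samples = np.sin(2 * np.pi * 440 * sample_times)

print(len(samples))                           # 44100 pictures of that one second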

_________________________________________________________________

 

  • Standard sample rates (44.1 kHz, 48 kHz) and recommendations for application.

Higher sample rates cost more computer processing power; running at 96 kHz demands some serious computing muscle. But as you can see below, CD quality is 44.1 kHz, so even if you choose 44.1 kHz, you’re still in a good spot!

44.1 kHz: Standard for CDs and most digital audio.

48 kHz: Common in film and video production for higher fidelity.

96 kHz: Higher sample rate often used in professional audio recording for improved detail and quality.

192 kHz: Ultra-high fidelity, used in specialized applications but offers diminishing returns for most use cases.

Practical Use:

  • Music Production: 44.1 kHz or 48 kHz is usually sufficient.
  • Film/Video Production: 48 kHz is the industry standard.
  • Archiving and Mastering: Higher sample rates (e.g., 96 kHz) are sometimes used to preserve maximum quality.

If you save a file at 44.1kHz, it will always be at 44.1kHz.

You cannot move the saved file to a higher sample rate expecting more quality.

Once it’s saved, it is saved. This is why masters and archives are recorded at 96 kHz while CDs sit at 44.1 kHz. You can always down-sample, but up-sampling can’t bring back detail that was never captured.
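Here is a hedged little Python sketch of that idea, using 96 kHz and 48 kHz because the 2:1 ratio keeps the math simple (real resampling also involves filtering, which is skipped here).

import numpy as np

fs_high = 96_000
t_high = np.arange(fs_high) / fs_high
recorded_96k = np.sin(2 * np.pi * 1_000 * t_high)    # a 1 kHz tone captured at 96 kHz

saved_48k = recorded_96k[::2]                         # down-sample: half the detail is discarded

# "Up-sampling" back to 96 kHz can only interpolate between the samples we kept;
# the detail thrown away above is never recovered.
upsampled_96k = np.interp(t_high, t_high[::2], saved_48k)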

_________________________________________________________________

 

  • 4.2 Bit Depth

    • Bit depth and its impact on audio resolution.

Bit Depth defines how finely the samples are measured.

For example, bit depth is like the quality of the camera that’s taking pictures of the analog signal. If we took our pictures with an early 2000s flip phone, we would get the picture but not much detail. If we took the pictures with whatever iPhone or Android is on the market when you’re reading this, we’d have impeccable quality, depth, and richness of color compared to that flip phone.

While sample rate is how many pictures are taken in a second,

bit depth is the quality of those pictures being taken.


Another over-simplified explanation for the purpose of understanding the concept.

If we go back to our sample-rate-of-5 example and use a bit depth of 4, this means there are 4 bits per sample.

Each pink dot represents a ‘bit’, and if we connect all the dots our signal would look something like this.

You can see that where there should have been a smooth curve in the wave, there’s now a blocky, choppy representation of it. The signal gets translated, but rather poorly.


What if we double the bit depth to 8 but keep the same sample rate of 5? There are now more ‘bits’ of resolution for each sample taken.


You can now see how much smoother the higher bit depth signal is. There are actually fairly smooth curves where the smaller bit depth chopped them off.
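If you want to see the same comparison numerically, here is a rough Python sketch that rounds a smooth curve to 4-bit and 8-bit levels (the test curve and the error measurement are only for illustration).

import numpy as np

def quantize(signal, bits):
    levels = 2 ** bits                        # 4-bit = 16 levels, 8-bit = 256 levels
    return np.round(signal * (levels - 1)) / (levels - 1)

wave = np.sin(np.linspace(0, np.pi, 100))     # a smooth curve rising from 0 to 1 and back
rough = quantize(wave, bits=4)                # blocky, choppy version of the curve
smooth = quantize(wave, bits=8)               # much closer to the original shape

print(np.max(np.abs(wave - rough)))           # bigger error at 4 bits
print(np.max(np.abs(wave - smooth)))          # smaller error at 8 bits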


You can think of a bit as a portion of a picture. A picture that is made of 16 bits will have less resolution than a picture that is made of 32 bits.

If we are running a sample rate of 48 kHz with a bit depth of 16-bit, every one of those 48 thousand samples has 16 bits of information.

If we increased our bit depth to 32-bit, every one of those 48 thousand samples would carry double the depth of information compared to its 16-bit counterpart.

Not only does this increase the resolution and quality of the audio, it increases the size of the file too. Big settings equal big storage space. Don’t forget to save to your Master Folder and save often.
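As a rough back-of-the-envelope estimate (ignoring file headers and any compression), you can see the storage cost in a few lines of Python; the stereo channel count and one-minute duration are just assumed values for illustration.

def file_size_mb(sample_rate, bit_depth, channels, seconds):
    # bytes = samples per second * bytes per sample * channels * seconds
    return sample_rate * (bit_depth / 8) * channels * seconds / 1_000_000

print(file_size_mb(48_000, 16, 2, 60))        # ~11.5 MB per stereo minute at 48 kHz / 16-bit
print(file_size_mb(48_000, 32, 2, 60))        # ~23.0 MB -- double the bits, double the file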

_________________________________________________________________

 

Another relatable example.

Take a look at Mario.

8-bit Mario only has one skin tone.


16-bit Mario has 2-3 skin tones.


32-bit Mario now has shading, because the bits are fine enough that we can really see them. But they still aren't defined enough to create fingers.


64-bit Mario now has fingers and even ears with visible definition.


You can have 44,100 samples of 8-bit Marios, or you can have 48,000 samples of 64-bit Marios.

I hope this gives you a good visual representation of how bit depth affects the quality, or ‘resolution’, of the audio samples. Just like sample rate, the higher the bit depth, the more computing power is required and the bigger the file will be, because it contains more information.


Common Bit Depths:

  • 16-bit: Standard for CDs and many consumer audio formats.
  • 24-bit: Common in professional audio recording and production.
  • 32-bit floating point: Used in some high-end production setups.

_________________________________________________________________

  • 4.3 Buffer Size and Latency

    • Buffer size and its role in audio processing.

Buffer size is the number of samples your computer is allowed to process before sending the audio back to the audio interface and out to the speakers. It acts as a temporary storage area, or "buffer," that helps ensure smooth and uninterrupted audio performance.

You can think of the buffer as a bucket that dumps its contents once it gets full. If your first buffer bucket can only hold 64 samples before dumping the audio to the speakers, and your second buffer bucket can hold 1024 samples, the first bucket will dump its contents 16 times before the second bucket dumps its contents once. This means your computer is working roughly 16 times harder at a 64-sample buffer size than it is at a 1024-sample buffer size.

When processing audio, the computer needs a small amount of time to process the data (e.g., applying effects, routing audio). Buffer size doesn’t just determine how many samples are processed before playback; it also determines how much time the computer gets to perform that processing. Larger buffers give the computer more time to process, but they introduce latency (the time it takes to get from input to output, or the time it takes for your buffer bucket to fill up). Smaller buffers process fewer samples at a time, which can reduce latency but also increases the strain on your computer. This can cause glitches or pops if the system can’t keep up, and sometimes it can freeze or completely crash the DAW because the computer doesn’t have enough power to process the information in the amount of time you’ve given it.
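To put rough numbers on the bucket analogy, here is a small Python sketch; the 44.1 kHz sample rate is just an assumed working value.

sample_rate = 44_100                          # assumed sample rate

for buffer_size in (64, 1024):
    dumps_per_second = sample_rate / buffer_size          # how often the bucket empties
    time_to_fill_ms = buffer_size / sample_rate * 1000    # how long one bucket takes to fill
    print(buffer_size, round(dumps_per_second), round(time_to_fill_ms, 2))

# 64-sample bucket:   ~689 dumps per second, ~1.45 ms each
# 1024-sample bucket: ~43 dumps per second, ~23.22 ms each (16x fewer dumps)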

_________________________________________________________________

 

  • Latency and its role in audio processing.

Latency is the total time it takes for audio to enter the microphone, travel to the interface, then be processed by the DAW, sent back to the interface, and then be played through the speakers. This is also referred to as “Round Trip Latency”. It is measured in milliseconds and if long enough, can result in an audible delay from the time you talk into the mic, to the time you hear it played back through your headphones or speakers. This occurs simply because it takes time for the computer to process.

With our buffer bucket example, our 1024 sample buffer bucket takes 16 times longer to fill up than the 64 sample buffer bucket. This difference in time is latency.

Smaller buffer sizes result in shorter latency, which is critical for real-time applications like recording or live performance. This tighter processing deadline can cause glitches or crashes if your computer can’t keep up.

Larger buffer sizes allow more time for processing but introduce latency, making them better for mixing or mastering. This lets your computer relax in larger sessions, simply because it has more time to process the information.

_________________________________________________________________

 

  • Buffer Size and Latency

64 samples: Very low latency (~1.45 ms), but requires a powerful computer.

128 samples: Low latency (~2.9 ms), suitable for recording and monitoring.

256 samples: Moderate latency (~5.8 ms), ideal for balancing recording and processing.

512 samples: High latency (~11.61 ms), better for mixing or mastering when latency is less critical.

1024 samples or higher: Very high latency (~23 ms), considered unusable for live performance or recording purposes.

I’ve personally gone to 2048 in some extreme cases.
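Those approximate figures line up with a 44.1 kHz sample rate, and you can reproduce them with one line of math per buffer size: latency (ms) = buffer size / sample rate x 1000. Here is a quick Python sketch; note this is only the buffer’s share of the delay, and true round-trip latency adds converter and driver time on top.

sample_rate = 44_100

for buffer_size in (64, 128, 256, 512, 1024, 2048):
    latency_ms = buffer_size / sample_rate * 1000
    print(f"{buffer_size:>5} samples -> ~{latency_ms:.2f} ms")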

_________________________________________________________________

 

In summary,

Sample rate refers to how many times a second the audio interface takes a picture of our incoming audio. It is measured in kilohertz (kHz), with a baseline standard of 44.1 kHz, meaning 44,100 samples per second for CD-quality audio.

Bit depth refers to the quality of each sample. The more bits of information you have in your sample, the more detailed it is, with CD quality at 16-bit and very detailed audio projects at 32-bit.

Buffer size is how many samples you allow your computer to process before it’s forced to play that audio out of your speakers. The smaller the buffer, the fewer samples your computer gets to process before playing them back; this is very CPU intensive because your computer has to think faster with less time to work. The larger the buffer size, the more time your computer has to process the audio before playback.

Latency refers to the amount of time it takes for your buffer to be processed, or how long it takes for the microphone to send the audio to the computer, for your computer to process everything, and for that audio to be sent back to the audio interface and out to the speakers. The whole round trip.

Latency and buffer size are proportional to one another: the lower the buffer size, the lower the latency; the higher the buffer size, the higher the latency.

The strain on your computer’s processor is inversely related to the buffer size. The smaller the buffer size, the harder the computer has to work because it has to think faster. The larger the buffer size, the longer your computer has to process things, making the load easier to handle.

Sample rate and bit depth directly influence the quality of the audio, while buffer size sets the number of samples your computer processes before playing them back to you. Latency is the measure of how long it takes the computer to fill and process that buffer.

 

 

 
