This is only true if you’re interested in sampling ultrasonic content into the sonic region. If you’re concerned with re-pitching only audible content (e.g. 20kHz and lower) from the source, there is no mathematical advantage to using 96kHz in re-pitching - complete and perfect reconstruction and thus resampling/re-pitching is possible for all frequencies under Nyquist at any sample rate.
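To make that concrete, here's a minimal numpy sketch of band-limited re-pitching at 48kHz (toy parameters, a naive O(N²) sinc sum with no edge handling - nothing you'd use in production, just the Whittaker-Shannon reconstruction idea):

```python
import numpy as np

sr = 48_000
f = 1_000.0
n = np.arange(512)
x = np.sin(2 * np.pi * f * n / sr)          # band-limited source at 48 kHz

# Re-pitch down one semitone by resampling: evaluate the ideal
# band-limited reconstruction x(t) = sum_k x[k] * sinc(t - k)
# at read positions stretched by the pitch ratio.
ratio = 2 ** (-1 / 12)                      # ~0.9439, one semitone down
t = np.arange(int(len(n) / ratio)) * ratio  # fractional read positions
recon = np.array([np.sum(x * np.sinc(ti - n)) for ti in t])
```

Away from the buffer edges, `recon` matches the ideal semitone-down sine to well under a percent - no 96kHz source required, because everything below Nyquist is already perfectly represented.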

7 Likes

You are my hero. 20 characters of thank you.

1 Like

I believe you are correct, I think I was thinking of time-stretching (while preserving pitch) when I wrote the post.

1 Like

Ah, time-stretching is FFT based in most cases, and in that sense oversampling can lead to more accurate FFT windows and thus a cleaner reconstruction in the inverse transform. However, mathematically speaking it doesn’t matter if you oversample the source or just upsample from the existing data - if the time stretching algorithm is implemented correctly it should be able to get an arbitrarily high-resolution stretch from a 48kHz source as well as from a 96kHz source - again, the mathematics of perfect reconstruction play out here too.

However, some stretching algorithms don’t oversample, and thus you’re effectively doing this work for them by operating at 96kHz. This is not a benefit of 96kHz, though, it’s just a deficiency in the implementation of the algorithm.

4 Likes

I always record at 192 for this purpose (resampling), but produce masters at 48 - out of habit from working in post-production.

Another thing to note is that higher sample rates give improved latency (for the same buffer size) if you are monitoring through your DAW. In cases where the interface has “no”-latency monitoring software (i.e. RME’s and Universal Audio’s monitoring software), this doesn’t matter (I don’t think).

48k/24bit. Set ‘er there, don’t look back; go forth and make music.

Edit: FWIW, I do all of my work at these settings on a 2013 Macbook Pro. Issues are rare.

2 Likes

Yes, however in most cases the computer works harder servicing the higher sample rate than it does if you use smaller buffers at a lower rate to get the same latency figure. For most general purpose audio computing systems you get the best performance by choosing the smallest practical buffer size at the lowest sample rate. Increasing the sample rate from there usually requires increasing the buffer size as well. Either way, the system has to compute more data overall, so higher sample rates usually push it closer to the edge of instability.
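The arithmetic behind that trade-off is simple enough to sketch (the buffer sizes below are just illustrative):

```python
def monitoring_latency_ms(buffer_samples: int, sample_rate_hz: int) -> float:
    """One buffer's worth of latency, in milliseconds."""
    return 1000.0 * buffer_samples / sample_rate_hz

# Same 128-sample buffer: doubling the rate halves the latency...
low  = monitoring_latency_ms(128, 48_000)   # ~2.67 ms
high = monitoring_latency_ms(128, 96_000)   # ~1.33 ms

# ...but a 64-sample buffer at 48 kHz hits the same latency figure
# while the CPU only has to process half as many samples per second.
small = monitoring_latency_ms(64, 48_000)   # ~1.33 ms
```

Note this only counts the buffer itself - converter and driver latency come on top - but the proportions are the point.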

4 Likes

Makes sense, thanks for the clarification

FWIW 24bit/48kHz has been the standard for film/TV post-production for a very long time - whether for sound design or score/music… Accordingly the default for high-res recording is a multiple of that, i.e. 96kHz or 192kHz, so that half- and quarter-speed playback are easily implemented…

In short, yes. 48kHz has a higher Nyquist frequency, so it allows more complex signals in the upper parts of the spectrum. It sounds all the same to my ears, but in a non-oversampling DSP chain that tends to accumulate “errors”, 44.1kHz leaves you with more inaccuracies at the start and at every stage.

Everything upsampled to 96k for mastering here, if not already there. For my own recording projects, everything at 96k too, from start to finish. Easy to do a final SRC down and dither at the end for any required delivery formats.

I did lots of listening tests and my Crookwood converters seemed to like 96k the most. You’d have to do your own tests on your own gear to find what sounds best, but it seems to be true that some converters sound better at some rates (and not necessarily higher ones).

1 Like

Thanks a lot everybody for your opinions and experiences regarding this!

@mzero thanks a lot for those tables! These are really useful, since I do like to work with single cycle waveforms and create my own!

3 Likes

maybe you are interested in this AES paper:
Pras & Guastavino - Sampling rate discrimination: 44.1 kHz vs. 88.2 kHz [pdf] (1.4 MB)

1 Like

Thanks for posting, this was great and confirms some of my own findings.

Also, using higher sampling rates reduces the quantization noise in the audible frequency range. See: http://electronotes.netfirms.com/AN345.PDF
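A back-of-envelope sketch of that effect, using the standard ideal-quantizer model (SNR ≈ 6.02N + 1.76 dB, quantization noise assumed white and spread uniformly up to Nyquist - this is the textbook model, not anything specific to the linked note, and the function name is mine):

```python
import math

def inband_quant_noise_db(bits: int, sample_rate_hz: float,
                          band_hz: float = 20_000.0) -> float:
    """In-band quantization noise relative to full scale, in dB,
    assuming white noise spread uniformly from 0 to Nyquist."""
    snr_full = 6.02 * bits + 1.76                        # ideal full-band SNR
    spread = 10 * math.log10((sample_rate_hz / 2) / band_hz)
    return -(snr_full + spread)

# Each doubling of the sample rate pushes the noise floor in the
# audible band down by another ~3 dB:
a = inband_quant_noise_db(24, 48_000)
b = inband_quant_noise_db(24, 96_000)
```

Whether ~3 dB per doubling matters under a noise floor that's already around -147 dBFS at 24 bits is another question.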

My naive assumption about 48kHz was always that it related to 24fps cinema film.

This Wikipedia page has a nice table of sample rates and where they came from:

48,000 Hz
The standard audio sampling rate used by professional digital video equipment such as tape recorders, video servers, vision mixers and so on. This rate was chosen because it could reconstruct frequencies up to 22 kHz and work with 29.97 frames per second NTSC video – as well as 25 frame/s, 30 frame/s and 24 frame/s systems. With 29.97 frame/s systems it is necessary to handle 1601.6 audio samples per frame delivering an integer number of audio samples only every fifth video frame.[9] Also used for sound with consumer video formats like DV, digital TV, DVD, and films. The professional Serial Digital Interface (SDI) and High-definition Serial Digital Interface (HD-SDI) used to connect broadcast television equipment together uses this audio sampling frequency. Most professional audio gear uses 48 kHz sampling, including mixing consoles, and digital recording devices.

If you do the calculation, neither of them looks that good… but maybe if you go back to the actual scanline calculation for NTSC it looks a bit better. :man_shrugging:t5:

> 44100 / 29.97 ≈ 1471.4715
> 48000 / 29.97 ≈ 1601.6016

1 Like

Don’t believe anyone else answered this yet - as others mentioned, 48k is the default for working with film. The number was chosen because a) it is high enough to avoid the Nyquist-frequency issues previously discussed, and b) film is typically shot at 24 frames per second, so a number divisible by 24 was desired. In this case there are exactly 2000 samples of audio per frame.
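The arithmetic, for anyone who wants to check why 48k divides cleanly and 44.1k doesn’t:

```python
def samples_per_frame(sample_rate_hz: float, fps: float) -> float:
    """Audio samples per video/film frame."""
    return sample_rate_hz / fps

samples_per_frame(48_000, 24)   # 2000.0 - an integer, easy film sync
samples_per_frame(44_100, 24)   # 1837.5 - fractional, awkward to edit on
```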

Edit: @sam beat me to it by a few seconds (also my name is Sam too…)

One other difference between working at 44.1/48k vs higher SRs like 96k or 192k - if you are doing operations in the frequency domain (timestretching, pitchshifting etc - stuff that depends on FFT), having finer resolution data is beneficial.

2 Likes

I wonder how much the downsampling difference mentioned in the paper depends on the actual algorithm. Does anybody have any data on that?

And my takeaway from this paper is that it’s better to work at 88.2 or 96kHz if you end up doing something like vinyl (assuming they don’t downsample your audio to cut the vinyl) and your music benefits from the extra resolution. If you are making CDs just use 44.1k, and if you’re doing a digital-only release you could also do everything at 48k - but if the benefits of 96k over 48k are marginal, I don’t think it’s worth it.

Unless the focus is infra/ultrasound (where you might actually be a bit surprised what you get, since it’s stuff you couldn’t hear in the source recording), my understanding was that bit depth matters more than sample rate in reconstructing a somewhat accurate image for timestretching/pitch operations, and that the timestretch/pitch algorithm is the real factor more than kHz.

The only thing that’s clear to me from all this discussion is 24 bits all the time. The rest seems like a really secondary consideration, unless you’re doing some very extreme treatment at the limit of the audible frequencies where aliasing might occur.

1 Like