You're probably noticing that the two numbers are not multiples of one another a...

richard_todd · on Aug 16, 2020

Yeah, the forced interpolation nags me, even though it shouldn't. But also: if I'm going to encode at 128 kilobits per second, I can use those bits to produce either 44.1k samples or 48k samples. That's about 9% more samples per second that need to come out of the same number of compressed bits. I'm sure there are reasons (like the high correlation between adjacent samples) why that doesn't matter. But would you resize a 4400px image to 4800px before compressing it to a JPEG? No way, because if you target the same file size either way, you'd get more bits per pixel from the 4400px original.
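To put rough numbers on that, here is a minimal sketch of the bit budget per decoded sample at the two rates. It assumes a constant 128 kbit/s and ignores channel count and any container overhead; the function name is made up for illustration.

```python
BITRATE_BPS = 128_000  # target bitrate mentioned in the comment


def bits_per_sample(sample_rate_hz: int, bitrate_bps: int = BITRATE_BPS) -> float:
    """Average compressed bits available per output sample (assumption: constant bitrate)."""
    return bitrate_bps / sample_rate_hz


bits_at_44k1 = bits_per_sample(44_100)
bits_at_48k = bits_per_sample(48_000)

# Relative increase in samples per second when going from 44.1 kHz to 48 kHz.
extra_samples_pct = (48_000 / 44_100 - 1) * 100

print(f"44.1 kHz: {bits_at_44k1:.2f} bits/sample")
print(f"48 kHz:   {bits_at_48k:.2f} bits/sample")
print(f"extra samples at 48 kHz: {extra_samples_pct:.1f}%")
```

This only shows the raw average; it says nothing about whether the extra samples actually cost anything perceptually.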

neltnerb · on Aug 16, 2020

I haven't read the code, but it sounded like they encode in frequency space, so if they're already putting all the bits into encoding below 20kHz, it seems like it would not change the size (as the extra band you gain by going from 44.1kHz to 48kHz sampling already has no bits allocated to it).
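A rough way to see that argument: if a transform frame covers a fixed duration, the number of frequency coefficients below 20 kHz is the same at either sample rate, and the higher rate only adds coefficients above that. The sketch below assumes a 20 ms frame, one coefficient per input sample, and coefficients spread evenly from 0 Hz to the Nyquist frequency. Those are illustrative assumptions, not something taken from the codec's source.

```python
FRAME_MS = 20        # assumed frame duration, for illustration only
CUTOFF_HZ = 20_000   # upper edge of the band that gets bits, per the comment


def coeffs_per_frame(sample_rate_hz: int, frame_ms: int = FRAME_MS) -> int:
    """Coefficients per frame, assuming one coefficient per input sample."""
    return sample_rate_hz * frame_ms // 1000


def coeffs_below_cutoff(sample_rate_hz: int, cutoff_hz: int = CUTOFF_HZ) -> int:
    """Coefficients under the cutoff, assuming even spacing from 0 Hz to Nyquist."""
    n = coeffs_per_frame(sample_rate_hz)
    # cutoff / nyquist == cutoff * 2 / sample_rate; kept in integer arithmetic.
    return n * cutoff_hz * 2 // sample_rate_hz


for rate in (44_100, 48_000):
    total = coeffs_per_frame(rate)
    audible = coeffs_below_cutoff(rate)
    print(f"{rate} Hz: {total} coefficients per frame, {audible} below 20 kHz")
```

Under these assumptions both rates end up with 800 coefficients below 20 kHz per frame; the 48 kHz frame just has more coefficients above the cutoff, which is where no bits would be spent.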

richard_todd · on Aug 16, 2020

Since the MDCT is discrete, I assume it operates on power-of-2-sized batches of samples. So (like you, without looking at the code) I would have assumed that more samples/s means you need more transform blocks, which means you have to allocate fewer output bits per block to hit your target rate.
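As a sketch of that assumption only: if each transform block holds a fixed number of samples, a higher sample rate means more blocks per second and fewer bits per block at the same bitrate. The block size of 1024 is a hypothetical power of two chosen for illustration, not a value taken from any codec.

```python
BITRATE_BPS = 128_000
BLOCK_SAMPLES = 1024  # hypothetical power-of-two block size


def blocks_per_second(sample_rate_hz: int, block_samples: int = BLOCK_SAMPLES) -> float:
    """How many fixed-size blocks are needed to cover one second of audio."""
    return sample_rate_hz / block_samples


def bits_per_block(sample_rate_hz: int, bitrate_bps: int = BITRATE_BPS) -> float:
    """Average bit budget per block at a constant bitrate."""
    return bitrate_bps / blocks_per_second(sample_rate_hz)


for rate in (44_100, 48_000):
    print(
        f"{rate} Hz: {blocks_per_second(rate):.2f} blocks/s, "
        f"{bits_per_block(rate):.0f} bits/block"
    )
```

If the block instead covers a fixed duration rather than a fixed sample count, the number of blocks per second stays the same and this effect goes away, so the conclusion depends entirely on which of the two the codec actually does.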

neltnerb · on Aug 16, 2020

You are probably right. I forgot about the whole power-of-two thing for FFTs. That would definitely irritate the same part of my brain that would be put off by interpolating discrete samples, even if the artifacts are inaudible. Same vein as how 7 is more random than 6.