Akhilesh Bajaj and I discussed this issue briefly last night, and I'd like to see the subject opened for discussion here. Both of us are in the IT field and quite familiar with the technologies required to implement either mechanism, but neither of us has paid much attention to the actual algorithms chosen, other than to notice the debate and read about a couple of the most common types. Personally, my gut reaction, without giving it much thought, would have been to use a high oversampling rate and interpolate between the sampled words. That follows from the simple and obvious idea that higher sampling rates and greater bit density always produce a truer, higher-quality representation of the original signal.
But oversampling an existing dataset isn't the same thing as recording with a higher sampling rate. Oversampling simply makes up intermediate steps between two real samples and fills in the blanks. You can "make up" whatever you want to fill the gap, and this, I suppose, is what the creative algorithm writers work on. The most obvious approach is a simple interpolation scheme. But that can be done just as easily with an electrical low-pass filter: integrate the output of the DAC with a capacitor, a simple RC filter.
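To make the idea concrete, here's a rough NumPy sketch of 4x "oversampling" by straight linear interpolation. The tone frequency, ratio, and sample count are made up purely for illustration; this isn't any particular player's algorithm, just the simplest possible fill-in-the-gaps scheme.

    import numpy as np

    # Rough sketch of 4x "oversampling" by linear interpolation.
    fs = 44100                 # CD sample rate
    f = 1000.0                 # test tone, value picked arbitrarily
    n = np.arange(32)
    words = np.sin(2 * np.pi * f * n / fs)      # the recorded sample words

    ratio = 4                                   # invent 3 new points per gap
    t_orig = np.arange(len(words))
    t_up = np.arange((len(words) - 1) * ratio + 1) / ratio
    upsampled = np.interp(t_up, t_orig, words)  # straight-line fill between samples

    print(len(words), "->", len(upsampled))     # 32 -> 125 points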
This makes a great deal of sense, really. A simple one-to-one sample, output through a low-pass filter that smooths the top end. An analog computer, after all, is like a very high-resolution, high-speed digital computer. So the simple RC filter is an analog computer that does the interpolation very well.
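Here's a toy Python simulation of that idea, with a fine time grid standing in for continuous time and a made-up corner frequency and test tone. It just shows a first-order RC update smoothing the DAC's stair-step output, nothing more.

    import numpy as np

    # Toy simulation: the DAC's stair-step output run through a first-order
    # RC low-pass -- the "analog computer" doing the interpolation.
    fs = 44100                       # CD sample rate
    sim_rate = 64 * fs               # fine time grid standing in for continuous time
    f = 5000.0                       # test tone, value picked arbitrarily

    n = np.arange(1024)
    words = np.sin(2 * np.pi * f * n / fs)       # recorded samples
    stair = np.repeat(words, sim_rate // fs)     # zero-order-hold DAC output

    rc = 1.0 / (2 * np.pi * 20000.0)             # corner near 20 kHz, illustrative only
    dt = 1.0 / sim_rate
    alpha = dt / (rc + dt)                       # discrete update for an RC integrator

    smoothed = np.zeros_like(stair)
    acc = 0.0
    for i, v in enumerate(stair):
        acc += alpha * (v - acc)                 # capacitor charging toward the input
        smoothed[i] = acc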
Of course, the trouble with this reasoning is that 44.1 kHz is very close to the minimum rate required to reproduce 20 kHz. A 22 kHz signal gets only about two samples per cycle, so the DAC output comes closer to approximating a square wave than a sine. The RC filter would smooth the edges back toward a sine, but attenuating the unwanted components without attenuating the fundamental takes a very high-order "brick wall" filter: the nearest unwanted component of a 20 kHz tone, the image at 44.1 kHz minus 20 kHz (about 24 kHz), sits only a fraction of an octave above the signal, so the filter slope must be very steep and the corner must be placed immediately above 20 kHz.
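Rough numbers, just to see how little room there is. This is only back-of-the-envelope arithmetic, not a measurement of any actual player:

    import math

    fs = 44100.0
    f = 20000.0

    image = fs - f                     # nearest unwanted component in the stepped output
    print(image / 1000, "kHz")         # 24.1 kHz
    print(round(math.log2(image / f), 2), "octaves above 20 kHz")   # about 0.27

    # A single-pole RC rolls off at only 6 dB/octave, so over roughly a quarter
    # octave it buys you a couple of dB at best -- hence the brick-wall filter.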
Another trouble with such a low sampling frequency is what happens in the top octave. Signals up there will usually not be synchronized with the sampling clock, so each successive sample lands on a different point of the waveform. If you sample a 20 kHz sine with a 44.1 kHz ADC, you get roughly two samples per cycle, and the sample points drift slowly through the cycle: one sample may catch the leading 0.707 point of a half cycle, the next lands just before the trailing 0.707 point, the one after that falls near a zero crossing, and the next near the peak of the opposite half cycle. The recorded values therefore trace out a beat-like envelope at a couple of kilohertz even though the tone's amplitude is constant. Strictly speaking, a 20 kHz tone is still below the Nyquist limit, so the information is captured; the genuine aliasing danger is anything above 22.05 kHz reaching the ADC, which has to be removed before sampling. Anti-aliasing must be the biggest challenge to CD algorithm developers, and not so much the relatively simple integration filter on the DAC's output.
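You can see that beat-like envelope with a few lines of Python. Again, the numbers are just for illustration:

    import numpy as np

    fs = 44100.0
    f = 20000.0
    n = np.arange(30)
    x = np.sin(2 * np.pi * f * n / fs)   # what the ADC records

    # The input amplitude is constant, yet the recorded values swell and shrink,
    # because each sample lands on a different point of the cycle. The envelope
    # repeats at about fs/2 - f = 2.05 kHz.
    print(np.round(np.abs(x), 2))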
If data rate and storage aren't an issue, the best thing is to use a very high sampling rate and a long word length; the higher, the better. You get to a point where steep filtering is no longer required and aliasing is easy to avoid. But when the 44.1 kHz rate is a given, massaging the recorded data with anti-aliasing and other digital processing will probably yield improvements in the output analog signal representation.
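A quick comparison makes the point; the alternative rates here are just examples, not a claim about any particular format:

    import math

    f = 20000.0
    for fs in (44100.0, 96000.0, 192000.0):
        image = fs - f                            # nearest image of a 20 kHz tone
        gap = math.log2(image / f)
        print(f"fs = {fs/1000:5.1f} kHz: image at {image/1000:6.1f} kHz, "
              f"{gap:.1f} octaves above the signal")

At 44.1 kHz the image is barely a quarter of an octave away; at 192 kHz it is more than three octaves away, and a gentle filter is all you need.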
What are your thoughts?