Wayne, I missed this post until now.
I have been doing a lot of measuring and investigating lately, and I now think I have a pretty good understanding of what is required for good imaging and sound quality. Mind you, none of this is actually new; it is completely consistent with what I have believed and used all along, but now I have more data and experience to better quantify things.
There are two aspects of the sound REPRODUCTION problem (note: I am only talking about reproduction here, NOT sound reinforcement or musical instruments). These are the transient response and the steady-state response, both defined here as in-room responses. For imaging it is critical that the first 5-10 ms of the impulse response be reflection free. That is best done with careful polar design, speaker placement and selective room absorption. The image will be degraded by any sound arriving between the direct arrival and 5-10 ms after it. This first arrival must also be fairly flat in frequency response or it won't sound natural. Make no mistake about it: getting these kinds of impulse responses in real rooms is very, very challenging. If there is a floor or ceiling reflection, the image will be less precise and will tend to move in the direction of the reflection, but early lateral reflections will cause coloration and severe imaging problems, so lateral reflections are worse than vertical ones.
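To put rough numbers on that 5-10 ms window: at about 343 m/s, a reflection has to travel roughly 1.7 to 3.4 m farther than the direct sound to land outside it. Here is a minimal sketch of that geometry using the image-source method for a single flat surface. The function name, the room layout and the speed of sound are my own assumptions for illustration, not measurements from any real setup.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at roughly room temperature

def reflection_delay_ms(source, listener, axis, plane_pos):
    """Delay of a first-order reflection relative to the direct sound, in ms.

    source, listener: (x, y, z) positions in metres.
    axis: 0, 1 or 2, the coordinate the reflecting surface is perpendicular to.
    plane_pos: position of the surface along that axis, in metres.
    """
    # Mirror the source across the surface (image-source method).
    image = list(source)
    image[axis] = 2.0 * plane_pos - source[axis]
    direct = math.dist(source, listener)
    reflected = math.dist(image, listener)
    return (reflected - direct) / SPEED_OF_SOUND * 1000.0

# Hypothetical layout: side wall at x = 0, floor at z = 0.
speaker = (1.0, 0.0, 1.0)
listener = (2.0, 3.0, 1.0)
side_wall = reflection_delay_ms(speaker, listener, axis=0, plane_pos=0.0)
floor = reflection_delay_ms(speaker, listener, axis=2, plane_pos=0.0)

for name, delay in (("side wall", side_wall), ("floor", floor)):
    status = "inside" if delay < 5.0 else "outside"
    print(f"{name}: {delay:.2f} ms ({status} a 5 ms window)")
```

In this made-up layout both reflections arrive well inside the window, which is why placement alone rarely solves the problem and directivity and absorption have to do part of the work.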
Now to get a natural sound in a small room, the steady-state response also needs to be the same as the direct response: nearly flat. This is measured using a spatial averaging technique to get good measurement stability. If these two things are achieved, I have found that the imaging is extremely precise and the overall response is quite natural. The speakers will disappear and only a sound stage remains.
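One common way to do a spatial average is to measure the magnitude response at several microphone positions around the listening area and average them on a power basis. The sketch below assumes that approach; the post does not specify the exact averaging method, and the function name and the numbers are invented for illustration only.

```python
import math

def spatial_average_db(responses_db):
    """Power-average several magnitude responses given in dB.

    responses_db: list of measurements, one per microphone position,
    each a list of dB values on the same frequency grid.
    """
    n_pos = len(responses_db)
    averaged = []
    # zip(*...) walks the frequency grid, yielding one level per position.
    for levels in zip(*responses_db):
        mean_power = sum(10.0 ** (db / 10.0) for db in levels) / n_pos
        averaged.append(10.0 * math.log10(mean_power))
    return averaged

# Made-up levels for three mic positions at four frequencies.
measurements = [
    [0.0, -6.0, 2.0, 0.0],
    [0.0, 3.0, -4.0, 0.0],
    [0.0, -1.0, 1.0, 0.0],
]
average = spatial_average_db(measurements)
print([round(level, 2) for level in average])
```

Averaging over positions smooths out the position-dependent peaks and dips, so the result changes much less from one measurement session to the next than any single-point curve does.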
Now when one listens very close to the speakers, the reflections are brought down by the increase in direct-to-reverberant ratio and the imaging becomes quite precise, but, IMO, not natural. It sounds as if you have been transported into the recording. This will also occur in a very well damped room that has no room reflections to speak of. In these situations only the transient response is important because there is no real steady-state response.
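As a rough illustration of how quickly the direct-to-reverberant ratio climbs as you move closer, here is a sketch based on the classical diffuse-field model, where the direct sound falls 6 dB per doubling of distance and the reverberant level is constant. That model is an assumption on my part and is only approximate in small rooms; the critical distance used below is a hypothetical value, not a measurement.

```python
import math

def direct_to_reverberant_db(distance_m, critical_distance_m):
    """Direct-to-reverberant ratio in dB under a simple diffuse-field model.

    The ratio is 0 dB at the critical distance and rises by about 6 dB
    every time the listening distance is halved.
    """
    return 20.0 * math.log10(critical_distance_m / distance_m)

CRITICAL_DISTANCE_M = 1.0  # hypothetical value for a small, fairly live room

ratios = {d: direct_to_reverberant_db(d, CRITICAL_DISTANCE_M)
          for d in (0.5, 1.0, 2.0, 4.0)}
for distance, ratio in ratios.items():
    print(f"{distance:.1f} m: {ratio:+.1f} dB")
```

Under these assumptions, moving from 2 m to 0.5 m swings the balance by about 12 dB in favor of the direct sound, which is the near-field effect described above.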
In a lively room, however, where there is a lot of reverberant energy, things are much more difficult to do, but if done correctly they are much more realistic. It then sounds as if the music was moved into the room with you, not you into the recording. The room adds spaciousness and gives an overall perception of the music being in the same space as the listener. The steady-state response and the transient response are both critical for this to work properly.
This is my current level of understanding of imaging and naturalness, as well as the disappearing speaker trick. The speakers will only disappear in the lively room, as they will always be obvious in the transient-only situation.