Introduction
At the February 2000 NAMM show, Cakewalk invited representatives from
Microsoft and over 30 major hardware and software vendors to the first
annual "Windows Professional Audio Roundtable." The purpose of the roundtable
was to work together towards solutions that would make Windows the ideal
platform for professional audio. This paper presents the results of
the roundtable discussions.
Latency: What’s Required vs. What’s Possible
The most important performance criterion of a DAW is latency, i.e.,
the delay between when the software changes a sound and when that change
is actually heard. Latency affects the overall responsiveness of a DAW’s
user interface to input gestures as well as the applicability of a DAW
for live input monitoring. The present trend towards software synthesis
also highlights the influence of latency on the playability of a software-based
instrument. Unfortunately, latency happens to be the exact place where external
factors influence performance the most.
How low must latency be? A skilled audio engineer can hear subtle differences
in the "feel" of a drum recording simply by moving a microphone 1 foot,
a distance equaling 1 msec of delay. Studies have shown that humans
can perceive interaural (stereo) differences as low as 10 usec (0.01
msec). Obviously, lower is better.
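The microphone figure can be checked with a short calculation. The sketch below is only illustrative: the speed of sound (343 m/s, dry air at roughly 20 °C) and the function name are our assumptions, not values from the roundtable.

```python
# Assumed constant: speed of sound in dry air at about 20 degrees C.
SPEED_OF_SOUND_M_PER_S = 343.0
METERS_PER_FOOT = 0.3048


def acoustic_delay_msec(distance_feet: float) -> float:
    """Time, in msec, for sound to travel the given distance in feet."""
    distance_m = distance_feet * METERS_PER_FOOT
    return distance_m / SPEED_OF_SOUND_M_PER_S * 1000.0


if __name__ == "__main__":
    # Moving a microphone 1 foot changes the arrival time by about 0.89 msec.
    print(f"{acoustic_delay_msec(1.0):.2f} msec")
```

Under these assumptions, 1 foot corresponds to a little under 1 msec, which is consistent with the rule of thumb quoted above.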
What’s the best we can deliver? Despite claims by hardware and software
vendors, no one has ever scientifically measured audio latency in a
DAW. However, we do know for certain that there are three hard limitations
that put a fixed lower bound on the latency that a host application
can deliver.
1. The DACs and ADCs in a sound card have some inherent delay.
Typical converter latency is in the range of 30-50 samples, which
represents about 0.7-1.1 msec (roughly 1 msec) of delay at 44.1 kHz.
2. The host operating system (Win9x, NT or 2k) will introduce interrupt
latency, a delay between when a hardware interrupt occurs and when
the lowest levels of the driver receive control. Interrupt latency is
a fundamental measure of an operating system’s performance and is not
a factor that is open to optimization.
An analysis of interrupt latency in Windows was presented at OSDI’99
by Erik Cota-Robles and James P. Held (http://www.intel.org/ial/sm/usenix).
Their results show that the best case latency on Win9x or WinNT is about
1 msec, and that the worst case (on Win9x) can be as long as 100+ msec.
3. The scheduler in the host operating system leads to unpredictable
timing when an application (user mode) thread needs to be woken up for
audio streaming tasks. With clever design this can be made more predictable,
so for argument’s sake we’ll neglect this limitation.
When you consider the effects of converter latency and interrupt latency,
it becomes clear that the lowest latency you can ever hope to achieve
under Windows is about 2 msec. In reality, the influence of system load
on interrupt latency and the scheduler will lead to inconsistent performance
(manifested by random audio drop-outs), so in most practical cases the
audio latency will be much higher.
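As a rough worked example of this lower bound, the sketch below adds the converter delay (30-50 samples at 44.1 kHz) to the best-case interrupt latency of about 1 msec quoted above. The function names are hypothetical, and counting a single converter stage once is our simplifying assumption.

```python
SAMPLE_RATE_HZ = 44_100               # sample rate used in the text
BEST_CASE_INTERRUPT_MSEC = 1.0        # best-case interrupt latency quoted above


def samples_to_msec(samples: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Convert a delay expressed in samples to milliseconds."""
    return samples / sample_rate_hz * 1000.0


def latency_floor_msec(converter_samples: int,
                       interrupt_msec: float = BEST_CASE_INTERRUPT_MSEC) -> float:
    """Converter delay plus interrupt latency; scheduler effects are neglected."""
    return samples_to_msec(converter_samples) + interrupt_msec


if __name__ == "__main__":
    for converter_samples in (30, 50):
        floor = latency_floor_msec(converter_samples)
        print(f"{converter_samples} samples -> {floor:.2f} msec")
    # 30 samples -> 1.68 msec
    # 50 samples -> 2.13 msec
```

Both ends of the converter range land near the 2 msec figure, before any allowance for system load or scheduling.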
For real-world usage scenarios, minimizing the uncertainty that arises
under heavy system loads is tantamount to reducing audio latency. Since
WinNT (and Win2k) have tightly bounded interrupt latencies, these platforms
should be better suited to the task of audio streaming. We believe an
obtainable target for audio latency under Win2k is 5 msec, even under
heavy system loads.
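To give a sense of scale for the 5 msec target, the sketch below converts a latency budget into the number of sample frames that fit within it. It assumes, purely for illustration, that the whole budget is spent on a single audio buffer; real systems also spend part of it on converter and interrupt latency. The function name is hypothetical.

```python
import math


def max_frames_within_budget(budget_msec: float, sample_rate_hz: int) -> int:
    """Largest whole number of sample frames whose duration fits the budget."""
    return math.floor(sample_rate_hz * budget_msec / 1000.0)


if __name__ == "__main__":
    # A 5 msec budget at 44.1 kHz leaves room for 220 frames.
    print(max_frames_within_budget(5.0, 44_100))
```

In other words, a 5 msec target at 44.1 kHz leaves room for only a couple of hundred samples per buffer, which is why tightly bounded interrupt latency matters so much.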
Software and Hardware Development
Observations and Conclusion
Software vendors face a daunting set of challenges. Customers demand
the lowest latency possible, but delivering this requires knowledge
of O/S issues that are neither well documented nor well understood.
As demonstrated by the WavePipe™ technology introduced in Cakewalk Pro
Audio 9, it is possible to get low latency out of standard drivers,
but this is still very much dependent on the quality of the driver.
Hardware vendors are challenged even further. On the Windows platform,
there are a variety of driver models to consider: VxD, NT drivers and
WDM. On top of these drivers live a multitude of user-mode APIs: MME,
DirectX, ASIO and EASI.
Audio hardware vendors are writing too much code to support too many
driver models and too many APIs. As a result, driver performance is
suffering overall.
Consider the steps a hardware vendor takes when planning which drivers
to build:
1. Choose a user-mode API: MME, DirectX, ASIO or EASI.
2. Choose a target operating system: Win9x or WinNT.
3. Develop the kernel mode component (.VxD or .SYS), utilizing the Microsoft
DDK for the chosen operating system.
4. Develop the user-mode component (.DRV or .DLL) to support
the API.
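The size of the resulting support matrix can be sketched by enumerating the choices in steps 1 and 2. This is only an illustration: it assumes every API is paired with every operating system, and the mapping of Win9x to a .VxD and WinNT to a .SYS kernel component is our reading of step 3. In practice a vendor may share kernel code across several APIs.

```python
from itertools import product

# Choices taken from steps 1 and 2 above.
USER_MODE_APIS = ["MME", "DirectX", "ASIO", "EASI"]
# Assumed mapping of target operating system to kernel component type.
KERNEL_COMPONENT = {"Win9x": ".VxD", "WinNT": ".SYS"}


def driver_matrix() -> list[tuple[str, str, str]]:
    """Every (API, operating system, kernel component) pairing."""
    return [(api, os_name, KERNEL_COMPONENT[os_name])
            for api, os_name in product(USER_MODE_APIS, KERNEL_COMPONENT)]


if __name__ == "__main__":
    matrix = driver_matrix()
    for api, os_name, kernel in matrix:
        print(f"{api:8s} on {os_name:6s} needs a {kernel} plus a user-mode component")
    print(f"{len(matrix)} API/OS pairings in total")
```

Even this simplified enumeration yields eight pairings, each needing its own user-mode component, before WDM is taken into account.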