Aggregation of multiples audio I/O devices #4

Open
hoch opened this issue Mar 6, 2019 · 5 comments
@hoch
Member

hoch commented Mar 6, 2019

One of the key advantages of ADC is a single callback function for input and output. This is possible by combining the input and output streams and serving them to the user. As shown in the example, the user can specify two different device IDs for input and output, respectively.

const constraints = {
  inputDeviceId: inputId,
  outputDeviceId: outputId
};

const client = await navigator.mediaDevices.getAudioDeviceClient(constraints);

It is common for the two devices to be physically separate (i.e. different clocks, sample rates, and threads). To serve these isolated streams, the system needs to re-clock and resample the audio data before sending it to the callback function. This is the so-called "device aggregation" in ADC.
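
To make the resampling half of that step concrete, here is a minimal sketch using linear interpolation. The function name `resampleLinear` is hypothetical and not part of any ADC proposal, and a real aggregation layer would likely use a higher-quality filter and also track clock drift over time, which this sketch does not attempt.

```javascript
// Hypothetical helper (not from the ADC proposal): resample one mono block
// by linear interpolation so that audio captured at `inputRate` can be
// handed to a callback running at `outputRate`.
function resampleLinear(input, inputRate, outputRate) {
  const outputLength = Math.floor((input.length * outputRate) / inputRate);
  const output = new Float32Array(outputLength);
  for (let i = 0; i < outputLength; i++) {
    // Fractional read position in the input block for output frame i.
    const position = (i * inputRate) / outputRate;
    const index = Math.floor(position);
    const fraction = position - index;
    const current = input[index];
    // Hold the last sample at the block edge instead of reading past it.
    const next = index + 1 < input.length ? input[index + 1] : current;
    output[i] = current + (next - current) * fraction;
  }
  return output;
}
```

For example, a 441-frame block captured at 44100 Hz becomes a 480-frame block at 48000 Hz.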

Problem 1. The scope of aggregation

  1. The aggregation should only include 1 input device and 1 output device.
  2. The aggregation should be free-for-all (multi-input and multi-output).

For option 2 (which is quite similar to macOS's aggregate device), we can think of something like this:

const constraints = {
  inputDeviceId: [inputId0, inputId1, inputId2],
  outputDeviceId: [outputId0, outputId2],
};
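
One way to see how the two options relate is to treat a single ID as a one-element list. The sketch below assumes exactly that; `normalizeDeviceConstraints` and `isSinglePairAggregation` are hypothetical names used only for illustration, not proposed API.

```javascript
// Hypothetical helper: accept either a single device ID (option 1) or an
// array of IDs (option 2) and always return arrays.
function normalizeDeviceConstraints(constraints) {
  const toList = (value) => {
    if (value === undefined || value === null) return [];
    return Array.isArray(value) ? [...value] : [value];
  };
  return {
    inputDeviceIds: toList(constraints.inputDeviceId),
    outputDeviceIds: toList(constraints.outputDeviceId),
  };
}

// Option 1 holds when each direction names at most one device.
function isSinglePairAggregation(constraints) {
  const { inputDeviceIds, outputDeviceIds } = normalizeDeviceConstraints(constraints);
  return inputDeviceIds.length <= 1 && outputDeviceIds.length <= 1;
}
```

With this shape, an implementation limited to option 1 could reject any constraints for which `isSinglePairAggregation` returns `false`.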

Problem 2. The configurability of aggregation layer

Aggregation by the system involves many parameters: resampling quality, re-clocker options, the speed/quality trade-off, and so on. Should ADC expose these options at all? Or should we just say this is up to the UA? Or should it be somewhere in the middle?

NOTE: @padenot mentioned at TPAC 2018 that Firefox uses this "re-clocking" mechanism to aggregate and align audio data from multiple devices.

hoch added the AudioDeviceClient label Mar 6, 2019
hoch self-assigned this Mar 7, 2019
@hoch
Member Author

hoch commented Mar 7, 2019

In the teleconference today, we agreed that multi-input/output aggregation should be provided by the OS. The group is in favor of option 1 from Problem 1. The DJ app use case was brought up as a counterexample, but developers can use multiple outputs to separate audience outs and monitor outs.
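
As a rough illustration of the DJ case, the sketch below assumes that "multiple outputs" means the output channels of a single 4-channel device, with channels 0-1 feeding the audience and channels 2-3 feeding the monitor. The function `routeDjOutputs` and the channel layout are assumptions for illustration only.

```javascript
// Hypothetical routing step for one block of audio. `outputChannels` is an
// array of Float32Array, one per channel of a single 4-channel device.
// `mainMix` and `cueMix` are [left, right] pairs of the same block length.
function routeDjOutputs(mainMix, cueMix, outputChannels) {
  const [mainLeft, mainRight] = mainMix;
  const [cueLeft, cueRight] = cueMix;
  outputChannels[0].set(mainLeft);  // audience out, left
  outputChannels[1].set(mainRight); // audience out, right
  outputChannels[2].set(cueLeft);   // monitor out, left
  outputChannels[3].set(cueRight);  // monitor out, right
}
```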

As for configurability, the collective thought was to have some sort of controls, but we have not agreed on their degree or scope.

@rtoy
Member

rtoy commented Mar 8, 2019

Let me also add that we discussed exposing the resampler (if needed) so that the developer can trade off quality against latency. No decision was made to do anything about this, but it is something we might want to think about.

I also want to give the rationale for choosing option 1 from Problem 1:

  • People using multi-input and multi-output devices are already sophisticated users.
  • macOS already provides OS-level ways to aggregate devices into one virtual device.
    • But other OSes may not. We're expecting sophisticated users to be able to get the necessary software to aggregate devices.
  • It simplifies the API and the implementation.

Please correct me if I got these things wrong.

@pmlt

pmlt commented Sep 11, 2019

There are several use cases in which an audio device client would be created with only a single input or output device, but not the other. For these use cases, having to pay for the additional latency of clock synchronization without reaping any benefits would be unfortunate.

Any sound-generating application that is sensitive to latency (like a game) will have this issue. These apps rarely need audio input, and if they do, they do not require clock synchronization. They would sooner create two separate ADC contexts, one for input and one for output, if doing so would bypass clock synchronization. Mixing input and generated audio would eventually be done via SharedArrayBuffers and Atomics instead, and only when needed (e.g. when the player enables voice chat in a multiplayer match).
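
The SharedArrayBuffer and Atomics approach mentioned above could be sketched as a single-producer/single-consumer ring buffer: the input side pushes captured samples and the output callback pulls them only when mixing is needed. The class `SharedRingBuffer` is a hypothetical illustration, not part of ADC, and it makes no attempt at clock synchronization between the two sides.

```javascript
const READ = 0;  // slot holding the consumer's read index
const WRITE = 1; // slot holding the producer's write index

class SharedRingBuffer {
  static create(capacity) {
    // One slot stays empty so that "full" and "empty" are distinguishable.
    const slots = capacity + 1;
    return new SharedRingBuffer(
      new SharedArrayBuffer(2 * Uint32Array.BYTES_PER_ELEMENT),
      new SharedArrayBuffer(slots * Float32Array.BYTES_PER_ELEMENT)
    );
  }

  // Both threads wrap the same two SharedArrayBuffers.
  constructor(indexBuffer, sampleBuffer) {
    this.indices = new Uint32Array(indexBuffer);
    this.samples = new Float32Array(sampleBuffer);
    this.slots = this.samples.length;
  }

  // Producer side: returns the number of samples actually written.
  push(block) {
    const read = Atomics.load(this.indices, READ);
    const write = Atomics.load(this.indices, WRITE);
    const free = (read - write - 1 + this.slots) % this.slots;
    const count = Math.min(free, block.length);
    for (let i = 0; i < count; i++) {
      this.samples[(write + i) % this.slots] = block[i];
    }
    // Publish the new write index only after the samples are stored.
    Atomics.store(this.indices, WRITE, (write + count) % this.slots);
    return count;
  }

  // Consumer side: fills `block` and returns the number of samples read.
  pull(block) {
    const read = Atomics.load(this.indices, READ);
    const write = Atomics.load(this.indices, WRITE);
    const available = (write - read + this.slots) % this.slots;
    const count = Math.min(available, block.length);
    for (let i = 0; i < count; i++) {
      block[i] = this.samples[(read + i) % this.slots];
    }
    Atomics.store(this.indices, READ, (read + count) % this.slots);
    return count;
  }
}
```

When `pull` returns fewer samples than requested, the output side has to decide what to do with the gap (for example, fill it with silence), which is where the lack of clock synchronization shows up.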

Or perhaps what I'm describing is the 'raw mode' briefly mentioned in the code example? It is not entirely clear to me what this feature does.

@padenot
Member

padenot commented Sep 11, 2019

It sounds like you want to use two AudioDeviceClients, one for input, one for output.

@padenot
Member

padenot commented Nov 14, 2019

One important thing that is not being talked about here is that browsers have to have another IPC boundary between the system audio input/output code and the "content" code that runs scripts, etc., to be able to properly sandbox the "content" code. This is in contrast to native programs, which do the audio IO directly.

Aggregating the input and output streams, re-clocking in the process that does the audio IO, and doing only a single IPC transaction to the content process is far superior to doing multiple context switches and buffering. Doing so allows using lower buffer sizes, not the opposite: more threads mean more real-time threads and more context switches, which increases scheduling hazards and scheduler pressure, and leads to needing bigger buffer sizes to get solid audio.

The high-level nature of AudioContext and MediaStreams allows easily implementing this today: for example, round-trip latency in Firefox on OSX is limited by the fact that the Web Audio API requires doing block processing with 128-frame buffers: we're currently sub-10ms round trip on OSX without special hardware, but the limit is arbitrary.
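
For a sense of scale, the duration of one 128-frame block can be worked out directly. The helper `blockDurationMs` is hypothetical, and the 44100 Hz and 48000 Hz sample rates are assumed for illustration; the comment above does not state which rate was used.

```javascript
// Duration of one processing block in milliseconds.
function blockDurationMs(frames, sampleRate) {
  return (frames / sampleRate) * 1000;
}

const at48k = blockDurationMs(128, 48000); // about 2.67 ms
const at44k = blockDurationMs(128, 44100); // about 2.90 ms
```

So a single block accounts for under 3 ms at either rate, and a round trip spans at least one input block and one output block.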
