Fix for whisper-microphone example failure if audio isn't chunk aligned #2645

@anelson anelson commented Nov 27, 2024

At least on my macOS Sequoia system (MBP 14" 2021, M1 Pro), when I run the `whisper-microphone` example, it fails after it has gathered 10 seconds of audio, before the transcription:

Error: Insufficient buffer size 384 for input channel 0, expected 1024

At least for the audio device I'm using (Airpods Pro Max), there is no guarantee that each audio buffer holds a multiple of 1024 samples. As a result, at the end of the 10 seconds, `buffered_pcm` can end with samples that do not form a complete 1024-sample chunk.

This PR fixes that by tracking when there is a partial chunk at the end of the buffer and leaving it in `buffered_pcm` to be processed on the next loop iteration.
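
To illustrate the idea, here is a minimal standalone sketch of carrying a partial chunk over to the next iteration. It is not the merged diff: `CHUNK_SIZE` and `drain_full_chunks` are hypothetical names introduced for illustration, and only `buffered_pcm` and the 1024-sample chunk size come from the example.

```rust
/// Hypothetical constant; the example itself hard-codes 1024.
const CHUNK_SIZE: usize = 1024;

/// Hypothetical helper: removes every complete chunk from the front of
/// `buffered_pcm` and returns those samples. Any trailing samples that do
/// not fill a whole chunk stay in `buffered_pcm` for the next iteration.
fn drain_full_chunks(buffered_pcm: &mut Vec<f32>) -> Vec<f32> {
    // Round the length down to the nearest multiple of CHUNK_SIZE.
    let full_len = (buffered_pcm.len() / CHUNK_SIZE) * CHUNK_SIZE;
    // `drain` removes the range in place and shifts the remainder to the front.
    buffered_pcm.drain(..full_len).collect()
}
```

With a buffer of 2 × 1024 + 384 samples, the helper returns the first 2048 samples and leaves the last 384 in `buffered_pcm`; once the device delivers more audio, those 384 samples become the start of the next complete chunk.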

Note that, in the interest of keeping this PR as small as possible, I didn't make any other changes to this example. That said, I think a good enhancement would be to introduce a `const` for the hard-coded 1024-sample chunk size.

@LaurentMazare LaurentMazare merged commit 23ed8a9 into huggingface:main Nov 27, 2024
10 checks passed
@LaurentMazare

Thanks!
