Fix for whisper-microphone example failure if audio isn't chunk aligned #2645
At least on my macOS Sequoia system (MBP 14" 2021, M1 Pro), when I run the `whisper-microphone` example, it fails after gathering 10 seconds of audio, before the transcription.

At least for the audio device I'm using (Airpods Pro Max), there is no guarantee that each audio buffer is a multiple of 1024 samples. Thus, at the end of the 10 seconds, `buffered_pcm` can have some samples at the end that do not form a complete 1024-sample chunk.

This PR fixes that by tracking when there is a partial chunk at the end of the buffer and leaving it in `buffered_pcm` to be processed on the next loop iteration.

Note that, in the interest of keeping this PR as small as possible, I didn't make any other changes to this example. That said, I think a good enhancement would be to introduce a `const` for the hard-coded 1024-sample chunk size.
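
For illustration, here is a minimal sketch of the idea, not the actual diff in this PR. The names `CHUNK_SIZE` and `take_aligned_chunks` are hypothetical, and the sketch assumes `buffered_pcm` is a `Vec<f32>`; only the 1024 value and the `buffered_pcm` name come from the description above.

```rust
/// Hypothetical constant replacing the hard-coded 1024-sample chunk size.
const CHUNK_SIZE: usize = 1024;

/// Removes and returns the longest chunk-aligned prefix of `buffered_pcm`.
/// Any trailing partial chunk (fewer than CHUNK_SIZE samples) is left in
/// `buffered_pcm` so it can be completed on the next loop iteration.
/// Assumption: samples are stored as f32 in a Vec.
fn take_aligned_chunks(buffered_pcm: &mut Vec<f32>) -> Vec<f32> {
    // Integer division rounds down to the number of complete chunks.
    let aligned_len = (buffered_pcm.len() / CHUNK_SIZE) * CHUNK_SIZE;
    // `drain` removes the aligned prefix in place; the remainder stays behind.
    buffered_pcm.drain(..aligned_len).collect()
}
```

With a buffer of 2500 samples, for example, this would hand back 2048 samples (two full chunks) and keep the remaining 452 in `buffered_pcm` until more audio arrives.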