Attempting to pass data to the KaldiRecognizer results in an odd internal error #72

Open
TimBoettcher opened this issue Jul 18, 2023 · 4 comments

Comments

@TimBoettcher
TimBoettcher commented Jul 18, 2023

I'm trying to integrate vosk-browser into my Rust-based WASM project.

First off, I'd like to note that the API documentation linked in the README could be more precise: I only learned that model.KaldiRecognizer() requires sampleRate as an argument by looking at the source code.

I'm using the AudioRecorder web API to record a MediaStream, converting that to a Float32Array and copying that array into an AudioBuffer, which I then pass to acceptWaveform().

Apparently, the microphone records at a rate of 48 kHz, which seems reasonable to me. But when I actually pass the data to acceptWaveform(), I receive the following error:

ASSERTION_FAILED (VoskAPI:Compute():mel-computations.cc:242) Assertion failed: (!KALDI_ISNAN((*mel_energies_out)(i)))

This is followed by a second log entry that contains only undefined.

I'm not sure what this is about, honestly. Any pointers would be appreciated.
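
For anyone hitting the same assertion: it fires when a computed mel energy is NaN, and one plausible (unconfirmed) cause is that the samples handed to acceptWaveform() already contain NaN or infinite values. That can happen when encoded audio bytes are reinterpreted as floats instead of being decoded first. Below is a minimal diagnostic sketch; `inspectSamples` and `SampleReport` are hypothetical names and not part of vosk-browser.

```typescript
// Summary of a sample buffer, used to spot values that could break feature extraction.
interface SampleReport {
  length: number;
  nanCount: number;        // samples that are NaN
  infiniteCount: number;   // samples that are +Infinity or -Infinity
  outOfRangeCount: number; // finite samples outside the usual [-1, 1] float PCM range
}

// Hypothetical helper (not part of vosk-browser): scan the samples before
// copying them into an AudioBuffer and passing that to the recognizer.
function inspectSamples(samples: Float32Array): SampleReport {
  const report: SampleReport = {
    length: samples.length,
    nanCount: 0,
    infiniteCount: 0,
    outOfRangeCount: 0,
  };
  samples.forEach((s) => {
    if (Number.isNaN(s)) {
      report.nanCount++;
    } else if (!Number.isFinite(s)) {
      report.infiniteCount++;
    } else if (s < -1 || s > 1) {
      report.outOfRangeCount++;
    }
  });
  return report;
}
```

If `nanCount` or `infiniteCount` is non-zero for the data taken from the recording, the problem is likely in the conversion step rather than in the recognizer or the sample rate.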

@erikh2000

Hey, Tim. What happens if you run the examples? Do you get the same error with their calls to acceptWaveform()? Just trying to narrow down the problem, e.g. maybe Vosk always barfs on 48k sample rate.

Note I am not a maintainer. Just another vosk-browser user.

@ccoreilly
Owner

Hi @TimBoettcher

There are Rust bindings for Vosk, which would probably be a better choice for your Rust application.

As for your issue, could you share which model you're using and what sampleRate you're passing as a param to the recognizer?

@TimBoettcher
Author

@ccoreilly The Rust bindings are for a non-WASM context, though. Since I'm compiling the Rust project to WASM, the bindings wouldn't be particularly helpful, I believe.

I'm dynamically checking for the sample rate provided by the user's media device (via the settings of the MediaStream). In my case, that's a value of 48000, which I pass to the recognizer.
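
A rough sketch of that lookup, for reference. `MediaStreamTrack.getSettings()` is a standard web API and its result may include `sampleRate`, but the field is optional, so this falls back to a caller-supplied value. `TrackLike` and `resolveSampleRate` are hypothetical names introduced only for this illustration.

```typescript
// Minimal structural type so the sketch also runs outside a browser;
// a real MediaStreamTrack satisfies it.
interface TrackLike {
  getSettings(): { sampleRate?: number };
}

// Hypothetical helper: return the sample rate reported by the track,
// or the fallback when the setting is missing or not a positive finite number.
function resolveSampleRate(track: TrackLike, fallback: number): number {
  const rate = track.getSettings().sampleRate;
  if (typeof rate === "number" && Number.isFinite(rate) && rate > 0) {
    return rate;
  }
  return fallback;
}
```

In a browser this could be called with `stream.getAudioTracks()[0]` and, as the fallback, the `sampleRate` of the `AudioContext` that processes the audio. The resulting value is what gets passed to `model.KaldiRecognizer()`.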

As for the models, I downloaded the models intended for mobile devices from this website, excluding those models that did not comply with the file structure specified in the lib README.

The same error occurred with the small models for German and English.

@ccoreilly
Owner

I would assume there is an issue with the inputs you're passing to the acceptWaveform method.

Could you share the snippet of code you use to record up to when you feed it to the recognizer?

3 participants