[Feature Request] Support transcription of audio files larger than 25mb #24

Open
aschmelyun opened this issue Apr 16, 2023 · 2 comments
@aschmelyun
Owner

OpenAI's Whisper API has a hard limit of 25 MB per upload. Right now, Subvert doesn't split uploaded files; it just sends the entire audio file to Whisper.

Subvert will need a way of batching file uploads so that no single upload exceeds this limit.

This will require refactoring a large part of the video processing section, and it partly covers #18 as well.

For now, unfortunately, try to limit uploads to around 22 minutes.
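
As a rough check on the 22-minute figure: the longest audio that fits in one upload is the size limit divided by the audio bitrate. The sketch below assumes the limit means 25 MiB (25 × 1024 × 1024 bytes) and a constant bitrate; neither is stated in this issue, and `max_minutes` is a hypothetical helper, not part of Subvert.

```python
# Assumption: the 25 MB limit is interpreted as 25 MiB.
LIMIT_BYTES = 25 * 1024 * 1024


def max_minutes(bitrate_kbps: float, limit_bytes: int = LIMIT_BYTES) -> float:
    """Approximate minutes of constant-bitrate audio that fit in limit_bytes.

    Ignores container overhead, so the real figure is slightly lower.
    """
    if bitrate_kbps <= 0:
        raise ValueError("bitrate_kbps must be positive")
    seconds = limit_bytes * 8 / (bitrate_kbps * 1000)
    return seconds / 60


if __name__ == "__main__":
    for kbps in (64, 128, 160):
        print(f"{kbps} kbps -> about {max_minutes(kbps):.1f} minutes")
```

At 160 kbps this comes out just under 22 minutes, which would be consistent with the estimate above, though the bitrate Subvert actually produces isn't given here.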

@aschmelyun aschmelyun added the enhancement New feature or request label Apr 16, 2023
@aschmelyun aschmelyun self-assigned this Apr 16, 2023
@kuubus

kuubus commented May 17, 2023

Same problem here.
Solving it would require either transcoding the audio to fit within the limit or splitting it into segments. With segments, the most likely source of errors is the transition from one segment to the next. Another option would be to split the audio into segments that overlap slightly in time and then clean up the transitions in the returned text.
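
The overlapping-segment idea could be sketched roughly as below. This is only an illustration under assumptions: `plan_segments` and `stitch` are hypothetical helpers, not part of Subvert or the Whisper API, and the clean-up step uses a naive exact word match, which would miss cases where Whisper transcribes the overlapping audio differently in each segment.

```python
from typing import List, Tuple


def plan_segments(total_seconds: float, segment_seconds: float,
                  overlap_seconds: float) -> List[Tuple[float, float]]:
    """Return (start, end) times; each segment overlaps the previous one."""
    if segment_seconds <= 0:
        raise ValueError("segment_seconds must be positive")
    if not 0 <= overlap_seconds < segment_seconds:
        raise ValueError("overlap must be >= 0 and shorter than a segment")
    segments = []
    start = 0.0
    step = segment_seconds - overlap_seconds
    while start < total_seconds:
        end = min(start + segment_seconds, total_seconds)
        segments.append((start, end))
        if end >= total_seconds:
            break
        start += step
    return segments


def stitch(prev_text: str, next_text: str, max_words: int = 20) -> str:
    """Join two transcripts, dropping words repeated across the boundary.

    Looks for the longest run of words (up to max_words) that ends
    prev_text and also begins next_text, compared case-insensitively.
    """
    a, b = prev_text.split(), next_text.split()
    limit = min(max_words, len(a), len(b))
    for n in range(limit, 0, -1):
        if [w.lower() for w in a[-n:]] == [w.lower() for w in b[:n]]:
            return " ".join(a + b[n:])
    return " ".join(a + b)  # no overlap found: plain concatenation


if __name__ == "__main__":
    print(plan_segments(100, 40, 5))
    print(stitch("the quick brown fox", "brown fox jumps over"))
```

Each (start, end) pair would then be cut out of the audio and sent as its own upload, with `stitch` applied to consecutive results.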

@ma-lalonde

A simple workaround in the meantime, if I may suggest, would be to check the audio file size and automatically compress it as needed. I pre-process the audio with the command:
ffmpeg -i video.webm -vn -acodec mp3 -fs 26M output_file.mp3
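
One caveat with the command above: ffmpeg's `-fs` option stops writing once the size limit is reached, so a long recording would be cut off rather than compressed further. A possible alternative is to pick an audio bitrate from the duration so the whole file fits. The sketch below is a minimal illustration under assumptions: the limit is taken as 25 MiB, the duration is already known, and the helper names are hypothetical. It only builds the ffmpeg argument list and does not run it.

```python
import os
from typing import List

# Assumption: the 25 MB limit is interpreted as 25 MiB.
WHISPER_LIMIT_BYTES = 25 * 1024 * 1024


def needs_compression(path: str, limit_bytes: int = WHISPER_LIMIT_BYTES) -> bool:
    """True if the file on disk is larger than the upload limit."""
    return os.path.getsize(path) > limit_bytes


def target_bitrate_kbps(duration_seconds: float,
                        limit_bytes: int = WHISPER_LIMIT_BYTES,
                        headroom: float = 0.95) -> int:
    """Largest whole kbps that keeps the audio under the limit.

    headroom leaves a margin for container overhead.
    """
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return int(limit_bytes * 8 * headroom / duration_seconds / 1000)


def build_ffmpeg_command(src: str, dst: str, bitrate_kbps: int) -> List[str]:
    """Same flags as the command above, with -b:a in place of -fs."""
    return ["ffmpeg", "-i", src, "-vn", "-acodec", "mp3",
            "-b:a", f"{bitrate_kbps}k", dst]


if __name__ == "__main__":
    kbps = target_bitrate_kbps(3600)  # a one-hour recording
    print(" ".join(build_ffmpeg_command("video.webm", "output_file.mp3", kbps)))
```

Very long recordings would push the bitrate low enough to hurt transcription quality, at which point splitting into segments is probably the better route.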
