Automated Transcription Pipeline (whisper.cpp)

This repo contains code for the automated transcription of diverse audio files. At the time of writing, it:

Checks audio filename format conformity
Converts wav, m4a, and mp3 files into 16 kHz single-channel wav (stereo is downmixed to mono) via ffmpeg
Removes silence via sox
Transcribes audio locally using whisper.cpp (Mac Metal compatible)
Manages files and folders automatically
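
As a rough illustration of the conversion, silence-removal, and transcription steps listed above, the sketch below only builds the command lines and does not run them. It is not the code in this repo: the function names, the output file naming, the sox thresholds, and the whisper.cpp binary path are all assumptions made for the example, and the repo's actual filename convention is not reproduced here (only the extension is checked).

```python
from __future__ import annotations

from pathlib import Path

# Input formats named in the feature list above.
SUPPORTED_EXTENSIONS = {".wav", ".m4a", ".mp3"}


def is_supported(path: Path) -> bool:
    """Return True if the file has one of the supported audio extensions."""
    return path.suffix.lower() in SUPPORTED_EXTENSIONS


def ffmpeg_command(src: Path, dst: Path) -> list[str]:
    """Build an ffmpeg call that writes a 16 kHz, mono, 16-bit PCM wav."""
    return [
        "ffmpeg", "-y",        # overwrite dst if it already exists
        "-i", str(src),
        "-ar", "16000",        # resample to 16 kHz
        "-ac", "1",            # downmix to a single channel
        "-c:a", "pcm_s16le",   # 16-bit PCM wav
        str(dst),
    ]


def sox_command(src: Path, dst: Path,
                min_silence: str = "0.5", threshold: str = "1%") -> list[str]:
    """Build a sox call that trims leading silence and long pauses.

    The duration and threshold defaults are illustrative assumptions.
    """
    return [
        "sox", str(src), str(dst),
        "silence",
        "1", "0.1", threshold,          # trim leading silence until 0.1 s of sound
        "-1", min_silence, threshold,   # drop later pauses of at least min_silence s
    ]


def whisper_command(wav: Path, model: Path,
                    binary: str = "whisper.cpp/main") -> list[str]:
    """Build a whisper.cpp call; the binary path is an assumption and
    depends on how and when the submodule was built."""
    return [binary, "-m", str(model), "-f", str(wav), "-otxt"]


def plan(src: Path, workdir: Path, model: Path) -> list[list[str]]:
    """Return the three commands for one input file, in execution order."""
    if not is_supported(src):
        raise ValueError(f"unsupported audio format: {src.suffix}")
    converted = workdir / f"{src.stem}_16k.wav"       # hypothetical naming
    trimmed = workdir / f"{src.stem}_trimmed.wav"     # hypothetical naming
    return [
        ffmpeg_command(src, converted),
        sox_command(converted, trimmed),
        whisper_command(trimmed, model),
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once ffmpeg, sox, and a built whisper.cpp binary are available on the machine.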

Instructions

After a run, you should find a set of self-explanatory directories in your audio folder.

You can also run the scripts in ./src after transcription to avoid repeating work.
