Pre-processing code for Korean Speech Data with Number provided by AI Hub (숫자가 포함된 패턴 발화 데이터)

duckyngo/Korean-Speech-With-Numbers

Korean-Speech-With-Numbers

Pre-processing code for Korean Speech Data with Number provided by AI Hub

Intro

This repository contains pre-processing code for the Korean Speech Data with Number dataset provided by AI Hub.

The Korean Speech Data with Number dataset consists of more than 10,000 hours of voice data in 84 categories, covering numbers read as Sino-Korean words (한자어), native Korean words (고유어), and foreign loanwords (외래어).
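To illustrate why the reading of a number matters for transcripts, here is a minimal sketch of the two native reading systems. It is not taken from this repository: the function names `sino_korean` and `native_korean` are hypothetical, and the supported ranges (0 to 9999 and 1 to 99) are simplifications chosen for the example.

```python
# Illustrative only: these helpers are NOT part of this repository.
SINO_DIGITS = ["", "일", "이", "삼", "사", "오", "육", "칠", "팔", "구"]
SINO_UNITS = ["", "십", "백", "천"]  # ones, tens, hundreds, thousands
NATIVE_ONES = ["", "하나", "둘", "셋", "넷", "다섯", "여섯", "일곱", "여덟", "아홉"]
NATIVE_TENS = ["", "열", "스물", "서른", "마흔", "쉰", "예순", "일흔", "여든", "아흔"]


def sino_korean(n: int) -> str:
    """Spell an integer in 0..9999 with Sino-Korean (한자어) numerals."""
    if not 0 <= n <= 9999:
        raise ValueError("this sketch only supports 0..9999")
    if n == 0:
        return "영"
    digits = str(n)
    parts = []
    for i, ch in enumerate(digits):
        d = int(ch)
        if d == 0:
            continue  # zero digits are not pronounced
        unit = SINO_UNITS[len(digits) - 1 - i]
        # A leading 일 is dropped before 십/백/천 (10 is 십, not 일십).
        digit_word = "" if (d == 1 and unit) else SINO_DIGITS[d]
        parts.append(digit_word + unit)
    return "".join(parts)


def native_korean(n: int) -> str:
    """Spell an integer in 1..99 with native Korean (고유어) numerals."""
    if not 1 <= n <= 99:
        raise ValueError("this sketch only supports 1..99")
    tens, ones = divmod(n, 10)
    return NATIVE_TENS[tens] + NATIVE_ONES[ones]


if __name__ == "__main__":
    for value in (3, 15, 25):
        print(value, sino_korean(value), native_korean(value))
```

The same written digit therefore maps to different spoken forms depending on context, which is the kind of variation the dataset's categories cover.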

Usage

  1. Set data_root, data_sets, and the other options in run.sh to match your local copy of the dataset.

  2. Run 'bash run.sh' to pre-process the data.
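This README does not describe the output of run.sh. Since the references point to NeMo, one plausible target is a NeMo-style manifest: a JSON-lines file in which each line holds `audio_filepath`, `duration`, and `text`. The sketch below shows how such a file could be written; the helper name `write_manifest` and the sample paths are hypothetical, and the actual scripts in this repository may work differently.

```python
import json


def write_manifest(entries, manifest_path):
    """Write (audio_filepath, duration_seconds, text) tuples as JSON lines.

    Assumption: the target is the NeMo ASR manifest layout, one JSON
    object per line with audio_filepath, duration and text keys.
    Returns the number of records written.
    """
    count = 0
    with open(manifest_path, "w", encoding="utf-8") as f:
        for audio_filepath, duration, text in entries:
            record = {
                "audio_filepath": str(audio_filepath),
                "duration": round(float(duration), 3),
                "text": text.strip(),
            }
            # ensure_ascii=False keeps Hangul readable in the output file.
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical sample entries; real paths depend on data_root.
    samples = [
        ("wavs/sample_0001.wav", 2.41, "삼 번 출구"),
        ("wavs/sample_0002.wav", 3.07, "스물다섯 살"),
    ]
    print(write_manifest(samples, "train_manifest.json"))
```

Writing one JSON object per line keeps the manifest streamable, so large splits can be read without loading the whole file.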

Reference

  1. https://github.com/NVIDIA/NeMo
  2. https://m.blog.naver.com/PostView.naver?isHttpsRedirect=true&blogId=aimldl&logNo=221559323232
