XLingCorrelation

TODO

tests
documentation
group all preprocessing files into one folder:
- all_cha
- select, clean -> ortholines.txt
- phono -> phono.txt
- syllabify -> syllabified.txt
- auto-tags -> auto-tags.txt
- build grammars

Goal

Provide an easy-to-use package for computing the correlation between the output of segmentation algorithms and CDI reports. (To do: describe the project.)

Files

Corpus.py

Handles .cha files (or other formats? or nothing at all, just a clean orthographic file? or just tags.txt?). Can (or cannot) phonologize and syllabify (for which languages? None for now, though English is planned soon).

Get the number of words, phones, and syllables in the corpus
Get the number of single-word utterances
Get statistics on the corpus
Store the orthographic and gold versions, with statistics for each
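
The counts listed above could be computed along the following lines. This is only a sketch, not the code in Corpus.py: the function name `corpus_stats`, the `CorpusStats` container, and the input format (space-separated phones, with `;esyll` and `;eword` tokens closing each syllable and word) are assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed boundary markers (hypothetical for this project): phones are
# separated by spaces, and each syllable / word is closed by a marker token.
SYLL_SEP = ";esyll"
WORD_SEP = ";eword"


@dataclass
class CorpusStats:
    utterances: int
    words: int
    syllables: int
    phones: int
    single_word_utterances: int


def corpus_stats(lines):
    """Count utterances, words, syllables, and phones, one utterance per line."""
    utterances = words = syllables = phones = single = 0
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        n_words = tokens.count(WORD_SEP)
        n_sylls = tokens.count(SYLL_SEP)
        # Every token that is not a boundary marker is a phone.
        n_phones = len(tokens) - n_words - n_sylls
        utterances += 1
        words += n_words
        syllables += n_sylls
        phones += n_phones
        if n_words == 1:
            single += 1
    return CorpusStats(utterances, words, syllables, phones, single)


# Toy input with made-up phone symbols, for illustration only.
sample = [
    "h e ;esyll l o ;esyll ;eword",
    "th e ;esyll ;eword d o g ;esyll ;eword",
]
stats = corpus_stats(sample)
print(stats)
```

Counting marker tokens rather than splitting on them keeps the function to a single pass per line; a different input format would only require changing the two separator constants.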

Segmented.py

Given the segmented output, the orthographic text, and the gold segmentation:

Get a dictionary mapping phonological forms to orthographic forms (or rather build it from the CDI?)
Number and list of words, syllables, and phones
Freq_top and freq_words, written to files
True positives and related counts
Evaluation (F-score, etc.)
Correct words
Incorrect words
POS tagging?
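
As an illustration of the evaluation items above, token-level precision, recall, and F-score can be computed by comparing word spans between the segmented output and the gold segmentation. This is a minimal sketch under stated assumptions, not the implementation in Segmented.py: the names `word_spans` and `token_scores` are hypothetical, each utterance is assumed to be a string with spaces as word boundaries, and each character is assumed to stand for one phone.

```python
def word_spans(utterance):
    """Return a list of ((start, end), word) pairs, with positions counted
    over the utterance with its spaces removed."""
    spans, start = [], 0
    for word in utterance.split():
        spans.append(((start, start + len(word)), word))
        start += len(word)
    return spans


def token_scores(segmented, gold):
    """Token precision/recall/F-score, plus correct and incorrect words.

    A segmented word counts as correct only if both of its boundaries
    match a word in the gold segmentation of the same utterance.
    """
    true_pos = n_seg = n_gold = 0
    correct, incorrect = [], []
    for seg_utt, gold_utt in zip(segmented, gold):
        gold_set = {span for span, _ in word_spans(gold_utt)}
        n_gold += len(gold_set)
        for span, word in word_spans(seg_utt):
            n_seg += 1
            if span in gold_set:
                true_pos += 1
                correct.append(word)
            else:
                incorrect.append(word)
    precision = true_pos / n_seg if n_seg else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    total = precision + recall
    fscore = 2 * precision * recall / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "fscore": fscore,
        "correct": correct,
        "incorrect": incorrect,
    }


# Toy example: the first utterance is under-segmented, the second is correct.
gold = ["the dog", "a cat"]
segmented = ["thedog", "a cat"]
scores = token_scores(segmented, gold)
print(scores)
```

In this toy example two of the three proposed words match the four gold words, so precision is 2/3 and recall is 1/2. Matching on spans rather than on word strings avoids crediting a word that appears in the utterance but at the wrong position.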

Model.py

translate.py
