A small utility that converts the cedict_ts.u8 file into a JSON or CSV file. Additional features:
- Adds accented pinyin (tone marks), based on these rules
- Adds character-based HSK levels fetched from mandarinbean. The HSK 7-9 level is parsed from a different website, by wohok
- Adds zhuyin support, based on these conversion rules
- Adds Wade-Giles support, based on these conversion rules
Clone this project and run one of the cargo commands below. If needed, I can provide the generated JSON and CSV files.
cargo run -- generate -e ../cedict_ts.u8 -o ../cedict.json -f json
cargo run -- generate -e ../cedict_ts.u8 -o ../cedict.csv -f csv
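For context on what the generator reads, each entry in cedict_ts.u8 follows the CC-CEDICT layout `Traditional Simplified [pin1 yin1] /definition 1/definition 2/`, and lines starting with `#` are comments. The sketch below parses one such line. It is only an illustration: `Entry` and `parse_line` are hypothetical names, not this project's types, and the real tool may parse the file differently.

```rust
/// Hypothetical record for one CC-CEDICT entry (not the crate's own type).
#[derive(Debug, PartialEq)]
struct Entry {
    traditional: String,
    simplified: String,
    pinyin: String,
    definitions: Vec<String>,
}

/// Parse a single line of cedict_ts.u8.
/// Returns None for comments, blank lines, and lines that do not match the layout.
fn parse_line(line: &str) -> Option<Entry> {
    let line = line.trim();
    if line.is_empty() || line.starts_with('#') {
        return None;
    }
    // "Trad Simp [pinyin] /def/def/" -> head = "Trad Simp", rest = "pinyin] /def/def/"
    let (head, rest) = line.split_once(" [")?;
    let (pinyin, defs) = rest.split_once("] /")?;
    let (traditional, simplified) = head.split_once(' ')?;
    // Drop the trailing slash, then split the definitions on the remaining slashes.
    let definitions = defs
        .trim_end_matches('/')
        .split('/')
        .map(str::to_string)
        .collect();
    Some(Entry {
        traditional: traditional.to_string(),
        simplified: simplified.to_string(),
        pinyin: pinyin.to_string(),
        definitions,
    })
}
```

Calling `parse_line("我 我 [wo3] /I/me/my/")` yields an entry whose `pinyin` is `wo3` and whose `definitions` hold the three glosses.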
A small crate that provides several operations on the cedict.u8 file, as well as some operations on Chinese text, such as:
- Convert pinyin with tone marks to pinyin with tone numbers, and vice versa
- Convert pinyin to Wade-Giles
- Convert pinyin to zhuyin
- Convert Simplified Chinese text to Traditional, and vice versa
- Detect which Chinese variant a text is written in
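To give a rough idea of what variant detection and conversion involve, here is a toy sketch that does not use the crate. It assumes a tiny hand-written table of simplified/traditional pairs (`PAIRS`), and the names `Variant`, `detect_variant` and `to_traditional` are hypothetical. A real implementation needs a much larger table, and real conversion is not always one-to-one, so treat this as an illustration of the idea only.

```rust
use std::cmp::Ordering;

#[derive(Debug, PartialEq)]
enum Variant {
    Simplified,
    Traditional,
    Unknown,
}

// Toy table of (simplified, traditional) pairs; far from complete.
const PAIRS: &[(char, char)] = &[
    ('们', '們'), ('这', '這'), ('个', '個'), ('说', '說'), ('国', '國'),
    ('学', '學'), ('习', '習'), ('汉', '漢'), ('语', '語'),
];

/// Guess the variant by counting characters that exist in only one script.
fn detect_variant(text: &str) -> Variant {
    let (mut simp, mut trad) = (0usize, 0usize);
    for c in text.chars() {
        if PAIRS.iter().any(|&(s, _)| s == c) {
            simp += 1;
        } else if PAIRS.iter().any(|&(_, t)| t == c) {
            trad += 1;
        }
    }
    match simp.cmp(&trad) {
        Ordering::Greater => Variant::Simplified,
        Ordering::Less => Variant::Traditional,
        // No distinguishing characters, or an even mix.
        Ordering::Equal => Variant::Unknown,
    }
}

/// Replace each simplified character found in the table; keep everything else.
fn to_traditional(text: &str) -> String {
    text.chars()
        .map(|c| {
            PAIRS
                .iter()
                .find(|&&(s, _)| s == c)
                .map_or(c, |&(_, t)| t)
        })
        .collect()
}
```

Characters shared by both scripts, such as 我, do not influence the result, which is why a text made only of shared characters is reported as `Unknown`.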
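use dodo_zh::variant::KeyVariant;

fn main() {
    // `path` is the location of your cedict_ts.u8 file; define it before this call.
    // The KeyVariant can be either Traditional or Simplified Chinese.
    let cedict = dodo_zh::load_cedict_dictionary(path, KeyVariant::Traditional).unwrap();
    // The lookup returns an Item struct for the entry.
    let wo = cedict.items.get("我").unwrap();
    // `println!` needs a format string; Debug formatting of `translations` is assumed here.
    println!("{:?}", wo.translations);
}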
A set of examples shows how to do some pinyin manipulation, such as converting pinyin with tone numbers to pinyin with tone marks.
You can run the example with the following command:
cargo run --example dodo
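As a rough illustration of the tone-number to tone-mark conversion mentioned above, here is a standalone sketch that does not use the crate. It assumes the commonly cited placement rules (the mark goes on `a` or `e` if present, on the `o` of `ou`, otherwise on the last vowel), treats `u:` and `v` as `ü`, and handles lowercase syllables only. The function names are hypothetical, and the crate's own implementation may differ.

```rust
/// Return `vowel` carrying the mark for `tone` (1 to 4); other input is returned unchanged.
fn mark(vowel: char, tone: usize) -> char {
    let row = match vowel {
        'a' => "āáǎà",
        'e' => "ēéěè",
        'i' => "īíǐì",
        'o' => "ōóǒò",
        'u' => "ūúǔù",
        'ü' => "ǖǘǚǜ",
        _ => return vowel,
    };
    row.chars().nth(tone.wrapping_sub(1)).unwrap_or(vowel)
}

/// Convert one numbered syllable such as "hao3" into its accented form.
fn numbered_to_accented(syllable: &str) -> String {
    let normalize = |s: &str| s.replace("u:", "ü").replace('v', "ü");
    // Split off a trailing tone digit (1 to 5); anything else is returned as-is.
    let (body, tone) = match syllable.chars().last().and_then(|c| c.to_digit(10)) {
        Some(t) if (1..=5).contains(&t) => (&syllable[..syllable.len() - 1], t as usize),
        _ => return normalize(syllable),
    };
    let body = normalize(body);
    if tone == 5 {
        return body; // neutral tone: no mark
    }
    let chars: Vec<char> = body.chars().collect();
    // Rule 1: `a` or `e` takes the mark. Rule 2: in `ou`, the `o` takes it.
    // Rule 3: otherwise the last vowel takes it.
    let idx = chars
        .iter()
        .position(|&c| c == 'a' || c == 'e')
        .or_else(|| chars.windows(2).position(|w| w[0] == 'o' && w[1] == 'u'))
        .or_else(|| chars.iter().rposition(|&c| "iouü".contains(c)));
    match idx {
        Some(i) => chars
            .iter()
            .enumerate()
            .map(|(j, &c)| if j == i { mark(c, tone) } else { c })
            .collect(),
        None => body, // no vowel found: leave the syllable unmarked
    }
}
```

A full string such as `ni3 hao3` can be converted by splitting on whitespace, mapping each syllable through `numbered_to_accented`, and joining the results with spaces.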