diff --git a/README.md b/README.md
index 0d4bd38..b0da8ca 100644
--- a/README.md
+++ b/README.md
@@ -4,14 +4,9 @@
-[![PyPI](https://badge.fury.io/py/tensorflow.svg)](https://badge.fury.io/py/trtokenizer)
+[![PyPI](https://badge.fury.io/py/trtokenizer.svg)](https://badge.fury.io/py/trtokenizer)
 
 TrTokenizer is a complete solution for Turkish sentence and word tokenization with extensively-covering language
-conventions.
-
-If you think that Natural language models always need robust, fast, and accurate tokenizers, be sure that you are at the
-the right place now.
-
-Sentence tokenization approach uses non-prefix keyword given in 'tr_non_suffixes' file. This file can be expanded if
-required, for developer convenience lines start with # symbol are evaluated as comments.
-
-Designed regular expressions are pre-compiled to speed-up the performance.
+conventions. If you think natural language models always need robust, fast, and accurate tokenizers, you are in
+the right place. The sentence tokenization approach uses the non-prefix keywords given in the 'tr_non_suffixes' file. This file can be expanded if
+required; for developer convenience, lines starting with the # symbol are treated as comments.
+Regular expressions are pre-compiled to speed up performance.
 
 ## Install
@@ -42,4 +37,9 @@ word_tokenizer_object.tokenize()
 
 - Release platform specific shared dynamic libraries (Done, build/tr_tokenizer.cpython-38-x86_64-linux-gnu.so, only for Debian Linux with gcc compiler)
 - Limitations
-- Prepare a simple guide for contribution
\ No newline at end of file
+- Prepare a simple guide for contribution
+
+## Resources
+
+* [Speech and Language Processing](https://web.stanford.edu/~jurafsky/slp3/)
+* [Bogazici University CMPE-561](https://www.cmpe.boun.edu.tr/tr/courses/cmpe561)
\ No newline at end of file
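
The first hunk's new prose describes two implementation details: sentence splitting is driven by the keywords in the 'tr_non_suffixes' file (where `#` lines are comments), and the regular expressions are pre-compiled for speed. As a rough sketch of that file convention only, not the package's actual code (`load_non_suffixes` and `SENTENCE_BOUNDARY` are hypothetical names):

```python
import re

def load_non_suffixes(path="tr_non_suffixes"):
    """Read non-prefix keywords, one per line, from the given file."""
    keywords = set()
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            # Per the README convention, blank lines and lines starting
            # with '#' are skipped as comments.
            if not line or line.startswith("#"):
                continue
            keywords.add(line)
    return keywords

# Compiling once at import time, rather than inside every tokenize() call,
# is the kind of pre-compilation speed-up the README refers to.
SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+")
```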
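The second hunk's context line `word_tokenizer_object.tokenize()` comes from the README's usage example. A minimal end-to-end sketch, assuming `SentenceTokenizer` and `WordTokenizer` classes exposed by `trtokenizer.tr_tokenizer` (verify the import path against the installed package):

```python
# Import path and class names are assumptions based on the README's usage
# section; check them against the installed trtokenizer package.
from trtokenizer.tr_tokenizer import SentenceTokenizer, WordTokenizer

sentence_tokenizer_object = SentenceTokenizer()
word_tokenizer_object = WordTokenizer()

# Split a paragraph into sentences, then each sentence into words.
for sentence in sentence_tokenizer_object.tokenize("Merhaba dünya. Bu bir deneme."):
    print(word_tokenizer_object.tokenize(sentence))
```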