resources added
apdullah.yayik committed Mar 13, 2021
1 parent 0c6c240 commit 82e1c1b
Showing 1 changed file, README.md, with 8 additions and 8 deletions.
[![PyPI](https://badge.fury.io/py/trtokenizer.svg)](https://badge.fury.io/py/trtokenizer)

TrTokenizer is a complete solution for Turkish sentence and word tokenization with extensive coverage of language
conventions. If you think that natural language models always need robust, fast, and accurate tokenizers, be sure
that you are in the right place. The sentence tokenization approach uses the non-suffix keywords given in the
'tr_non_suffixes' file. This file can be expanded if required; for developer convenience, lines starting with the #
symbol are treated as comments.
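
For illustration, entries in 'tr_non_suffixes' might look like the following; the abbreviations shown here are
hypothetical examples, not the file's actual contents:

```text
# Abbreviations that end with a period without terminating the sentence
Dr
Prof
Av
```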

The regular expressions used by the tokenizers are pre-compiled to speed up performance.
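
The idea is that compiling a pattern once at import time avoids re-parsing it on every call. A minimal sketch of the
technique (the pattern and names below are illustrative, not the library's internals):

```python
import re

# Compiled once at module load, then reused on every call.
_WORD_PATTERN = re.compile(r"\w+|[^\w\s]")

def tokenize_words(sentence: str) -> list[str]:
    """Split a sentence into word and punctuation tokens."""
    return _WORD_PATTERN.findall(sentence)

print(tokenize_words("Merhaba, dünya!"))  # ['Merhaba', ',', 'dünya', '!']
```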

## Install
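
A minimal install-and-usage sketch, assuming the package installs from PyPI as `trtokenizer` (matching the badge
above) and exposes `SentenceTokenizer` and `WordTokenizer` classes with a `tokenize` method; the exact import path
below is an assumption:

```python
# pip install trtokenizer  -- assumed PyPI package name
from trtokenizer.tr_tokenizer import SentenceTokenizer, WordTokenizer  # assumed import path

sentence_tokenizer_object = SentenceTokenizer()
sentence_tokenizer_object.tokenize("Dr. Yayık geldi. Toplantı başladı.")  # given paragraph as string

word_tokenizer_object = WordTokenizer()
word_tokenizer_object.tokenize("Toplantı başladı.")  # given sentence as string
```
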
## To-do

- Release platform-specific shared dynamic libraries (done: build/tr_tokenizer.cpython-38-x86_64-linux-gnu.so, only
  for Debian Linux with the gcc compiler)
- Limitations
- Prepare a simple guide for contribution

## Resources

* [Speech and Language Processing](https://web.stanford.edu/~jurafsky/slp3/)
* [Bogazici University CMPE-561](https://www.cmpe.boun.edu.tr/tr/courses/cmpe561)
