Text preprocessing pipeline for my graduation project. The pipeline includes sentence boundary detection, a sentence tokenizer, a stemmer, a disambiguator, and a POS tagger. It uses the Turkish NLP library zemberek-nlp by Ahmet A. Akın and Turkish Deasciifier for Java by Ahmet Alp Balkan.
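To give a rough idea of what the first two stages do, here is a minimal sketch that uses only the Java standard library. It is not the project's code and does not call zemberek-nlp: `java.text.BreakIterator` stands in for sentence boundary detection and a simple regular expression stands in for the tokenizer. The class name `PipelineSketch`, the method names, and the sample text are all hypothetical.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the first two pipeline stages (stdlib only, no zemberek-nlp).
class PipelineSketch {

    // A token is either a run of letters/digits or a single non-space symbol.
    private static final Pattern TOKEN =
            Pattern.compile("[\\p{L}\\p{N}]+|[^\\s\\p{L}\\p{N}]");

    // Stage 1: sentence boundary detection with the JDK's BreakIterator.
    static List<String> splitSentences(String text) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.forLanguageTag("tr"));
        it.setText(text);
        List<String> sentences = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    // Stage 2: split one sentence into word and punctuation tokens.
    static List<String> tokenize(String sentence) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(sentence);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Sample review text made up for this sketch.
        for (String sentence : splitSentences("Film harika. Tekrar izlerim.")) {
            System.out.println(tokenize(sentence));
        }
    }
}
```

The later stages (stemming, disambiguation, POS tagging) depend on Turkish morphology and are left to zemberek-nlp; they are not approximated here.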
Type | Number of Reviews |
---|---|
Positive | 220,284 |
Negative | 14,881 |
- Java 8
- Maven