Release Version 1.2 · mesolitica/malaya

Released emotion analysis, https://malaya.readthedocs.io/en/latest/Emotion.html
Added sparse fast-text-char deep learning model for sentiment, emotion, and subjectivity analysis.

Sparse deep learning models

What happens if a word is not in the model's dictionary? Take setan: what if it appears in a text we want to classify? We ran into this problem when classifying social media texts and posts, where the words used often fall outside a fixed, dictionary-based vocabulary.

Malaya treats unknown words as <UNK>, which discards whatever information they carry. To solve this problem, we use character-based n-grams. Malaya uses trigrams through 5-grams. For example, the trigrams of setan:

setan = ['set', 'eta', 'tan']
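
The trigram example above extends naturally to the full 3-to-5 range. The following is a minimal sketch of character n-gram extraction; the function name char_ngrams and its parameters are illustrative assumptions, not part of Malaya's API.

```python
def char_ngrams(word, min_n=3, max_n=5):
    """Return every character n-gram of `word` for n in [min_n, max_n]."""
    grams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of width n across the word.
        for i in range(len(word) - n + 1):
            grams.append(word[i:i + n])
    return grams


print(char_ngrams("setan", 3, 3))  # ['set', 'eta', 'tan']
print(char_ngrams("setan"))        # ['set', 'eta', 'tan', 'seta', 'etan', 'setan']
```

Because an unseen word still shares n-grams with known words, it no longer collapses into a single <UNK> token.
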
Sklearn provides an easy interface for n-grams. The problem is that the resulting features are very sparse: they are mostly zeros, so a dense representation is not memory efficient. Sklearn returns the result as a sparse matrix, and luckily TensorFlow already provides functions for sparse inputs.
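
To make the sparsity concrete, here is a small pure-Python sketch that stores n-gram counts in coordinate form (indices, values, dense_shape), which is the layout TensorFlow's tf.SparseTensor accepts. It is an illustration only, not Malaya's implementation: the helper names char_ngrams and to_coo are hypothetical, and the second input text is made up for the example.

```python
from collections import Counter


def char_ngrams(text, min_n=3, max_n=5):
    """All character n-grams of `text` for n in [min_n, max_n]."""
    return [text[i:i + n]
            for n in range(min_n, max_n + 1)
            for i in range(len(text) - n + 1)]


def to_coo(texts, min_n=3, max_n=5):
    """Count n-grams per text and keep only the non-zero entries."""
    vocab = {}                 # n-gram -> column index
    indices, values = [], []
    for row, text in enumerate(texts):
        counts = Counter(char_ngrams(text, min_n, max_n))
        # Assign a column to each n-gram, then sort by column so the
        # entries are in row-major order.
        entries = sorted((vocab.setdefault(gram, len(vocab)), count)
                         for gram, count in counts.items())
        for col, count in entries:
            indices.append((row, col))
            values.append(count)
    dense_shape = (len(texts), len(vocab))
    return indices, values, dense_shape, vocab


indices, values, dense_shape, vocab = to_coo(["setan", "syaitan"], 3, 3)
print(dense_shape)   # (2, 7)
print(len(values))   # 8 non-zero entries out of 14 cells
```

With a realistic corpus the vocabulary grows to many thousands of n-grams while each text touches only a handful, so storing only the non-zero entries saves most of the memory.
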

To use these models, simply call malaya.sentiment.sparse_deep_model(), malaya.subjective.sparse_deep_model(), or malaya.emotion.sparse_deep_model().
