sentiment-analysis-leveraging-lstm

Sentiment analysis is the task of determining whether a piece of text has a positive, neutral, or negative connotation. The text is usually a review that a customer leaves after using a product, brand, or service. Such reviews show how appealing or off-putting the product, brand, or service was to the customer. These insights are useful both as an indicator of customer satisfaction and as input to business decisions.

The sentiment models in this project are built with a deep learning approach and trained on customer reviews of Amazon products. Because a Long Short-Term Memory (LSTM) network is effective at handling long sequences and learning long-term dependencies, it is used to automatically classify the sentiment of future product reviews.

NOTE: The image above was generated with the DALL·E preview app.

Highlights

Following are the highlights of the project:

- Sentiment Analysis of Amazon Product Reviews using an imbalanced dataset
- The initial sentiment model is trained and evaluated using the following sentiment distribution:
  - Positive Reviews: 89.02%
  - Neutral Reviews: 5.09%
  - Negative Reviews: 5.71%
- Usage of pre-trained GloVe Word Embeddings
- Explored different settings to build the sentiment model based on the following:
  - Batch Size
  - Number of LSTM Layers
  - Number of Units per LSTM Layer
  - Dropout Values
  - Absence or Presence of Dense Layer before the output layer
  - Epochs
  - Patience during Early Stopping
  - Word Stemming or Lemmatizing
- Trained and evaluated additional sentiment models by addressing the imbalance in data using the following methods:
  - Assigned class weights during the model training
  - Used SMOTE to synthetically create the oversampled data
- Comparison between different sentiment models
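
As an illustration of the class-weight idea listed above, here is a minimal sketch of one common weighting heuristic, `n_samples / (n_classes * class_count)`. It is an assumption for illustration and not necessarily the weighting used in the notebook; the function name `balanced_class_weights`, the integer label encoding, and the sample label list are all hypothetical.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes receive larger weights, so their errors count more
    in the training loss.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Hypothetical label list: 0 = negative, 1 = neutral, 2 = positive.
# The proportions are made up and only roughly mimic an imbalanced dataset.
labels = [2] * 90 + [1] * 5 + [0] * 5
weights = balanced_class_weights(labels)
print(weights)
```

If the model is built with Keras, a dictionary of this shape can be passed as the `class_weight` argument of `model.fit`.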

Dataset

Consumer Reviews of Amazon Products is the dataset used in this project. It contains over 34,000 consumer reviews of Amazon products such as the Kindle and Fire TV Stick. Each record includes basic product information such as the name, together with the review title, review text, review rating, and more. The dataset is publicly available on Kaggle.

In this dataset, the column reviews.rating holds values from 1 to 5, and the column reviews.text holds the review text. Each rating is mapped to a sentiment: values 1 and 2 are treated as negative, value 3 as neutral, and values 4 and 5 as positive.
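
The rating-to-sentiment mapping described above can be sketched in a few lines. The function name `rating_to_sentiment`, the string labels, and the sample ratings are assumptions for illustration; the notebook may encode the labels differently.

```python
def rating_to_sentiment(rating):
    """Map a 1-5 star rating to a sentiment label."""
    if rating in (1, 2):
        return "negative"
    if rating == 3:
        return "neutral"
    if rating in (4, 5):
        return "positive"
    # Missing or out-of-range ratings are rejected instead of guessed.
    raise ValueError(f"rating must be between 1 and 5, got {rating!r}")

# Hypothetical sample of values from the reviews.rating column.
ratings = [5, 4, 3, 1, 2, 5]
sentiments = [rating_to_sentiment(r) for r in ratings]
print(sentiments)
```

Ratings read from a CSV file often arrive as floats such as `5.0`; the membership checks above still match them because `5.0 == 5` in Python.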

Approach

Exploratory data analysis is performed on the dataset described above. The text column is cleaned, and the data is split into training, validation, and test sets. The text is then tokenized and padded, and word embeddings are prepared to set up the embedding layer of the sentiment model. An evaluation metric is chosen, and different settings are explored to build the sentiment model. Besides the initial model, which is trained and evaluated on the imbalanced data, two other models are built: one trained with class weights and one trained on synthetically oversampled data. Finally, the results are compared across the models trained and evaluated under the best setting.
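
To make the tokenize-and-pad step concrete, here is a minimal sketch in plain Python. It is not the notebook's code: the helper names `build_vocab` and `encode_and_pad`, the whitespace tokenizer, the reserved indices, and the sample reviews are all assumptions for illustration. It pads and truncates at the end of each sequence, which is one possible choice; some libraries pad at the start by default.

```python
from collections import Counter

PAD, OOV = 0, 1  # reserved indices: padding and out-of-vocabulary words

def build_vocab(texts, max_words=10000):
    """Assign an integer index to the most frequent words in the texts."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # Indices start at 2 because 0 and 1 are reserved for PAD and OOV.
    return {word: i + 2 for i, (word, _) in enumerate(counts.most_common(max_words))}

def encode_and_pad(texts, vocab, max_len):
    """Turn each text into a list of word indices with exactly max_len entries."""
    sequences = []
    for text in texts:
        ids = [vocab.get(word, OOV) for word in text.lower().split()]
        ids = ids[:max_len]                                   # truncate long reviews
        sequences.append(ids + [PAD] * (max_len - len(ids)))  # pad short reviews
    return sequences

# Hypothetical cleaned reviews; the vocabulary is built from training text only.
train_texts = ["great tablet for the price", "battery died after a week"]
vocab = build_vocab(train_texts)
padded = encode_and_pad(train_texts + ["great battery life"], vocab, max_len=6)
for row in padded:
    print(row)
```

Building the vocabulary from the training split alone keeps words that appear only in the validation or test data mapped to the out-of-vocabulary index, so no information leaks from those splits into the model input.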

Information About Files

dataset/1429_1.csv: Dataset of 34,660 consumer reviews for Amazon products
dataset/additional_dataset.txt: Links to additional datasets of 5,000 and 28,000 consumer reviews for Amazon products
screenshot/people-sentiment.png: Screenshot of the people with negative, neutral, and positive facial expressions
screenshot/sentiment-distribution.png: Screenshot of the imbalanced dataset
screenshot/results.png: Screenshot of a few results
sentiment-analysis-lstm.ipynb: Google Colab notebook for the project

License

This project is licensed under the MIT License. For more details, see the LICENSE.md file.

References

Here are some references I looked at while working on this project:

Papers

K. Baktha and B. K. Tripathy, "Investigation of recurrent neural networks in the field of sentiment analysis," 2017 International Conference on Communication and Signal Processing (ICCSP), 2017, pp. 2047-2050, doi: 10.1109/ICCSP.2017.8286763.
T. Katić and N. Milićević, "Comparing Sentiment Analysis and Document Representation Methods of Amazon Reviews," 2018 IEEE 16th International Symposium on Intelligent Systems and Informatics (SISY), 2018, pp. 000283-000286, doi: 10.1109/SISY.2018.8524814.
J. C. Gope, T. Tabassum, M. M. Mabrur, K. Yu and M. Arifuzzaman, "Sentiment Analysis of Amazon Product Reviews Using Machine Learning and Deep Learning Models," 2022 International Conference on Advancement in Electrical and Electronic Engineering (ICAEEE), 2022, pp. 1-6, doi: 10.1109/ICAEEE54957.2022.9836420.
N. Sharm, T. Jain, S. S. Narayan and A. C. Kandakar, "Sentiment Analysis of Amazon Smartphone Reviews Using Machine Learning Deep Learning," 2022 IEEE International Conference on Data Science and Information System (ICDSIS), 2022, pp. 1-4, doi: 10.1109/ICDSIS55133.2022.9915917.

Links and Blogs

End Notes

Did you find this project useful? Which other setting do you think can be explored? In which other way can the imbalance in this data be handled? Feel free to discuss your experiences on the discussion portal, and I'll be more than happy to discuss.
