Fake news detection by classifying news segments as real or fake. Required installations: Python 3.8, NLTK, Scikit-Learn, Jupyter. Covers text cleaning, tokenization, vectorization, classification model generation, and evaluation.

Shubha23/Fake-News-Detection-Text-Preprocessing-and-Classification

Aim: Fake news detection by classifying news pieces as real or fake.

Dataset: Fake News Balanced dataset for fake news analysis, from Kaggle.com

Files: Jupyter Notebook (Python), zipped dataset file.

Author: Shubha Mishra

The project accomplishes these tasks:

Cleaning and preprocessing the text of the fake_or_real_news dataset using NLTK and regular expressions.

Transforming the cleaned text into TF-IDF vectors.

Training models such as the Passive Aggressive Classifier, XGBoost, and LGBM to classify news pieces as fake or real. (A few other algorithms were also tried, but only the best performers are included here.)

Evaluating each model's performance based on its accuracy score and confusion matrix.

Please see the notebook for details on each of these steps.
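As a rough orientation, the four steps listed above can be sketched in a few lines with Scikit-Learn. This is a minimal illustration and not the notebook's actual code. The `clean_text` helper, the toy sentences, and the `FAKE`/`REAL` label strings are assumptions made for the example. Scikit-Learn's built-in English stop-word list stands in for NLTK so that no corpus download is needed, and only the Passive Aggressive Classifier is shown.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import PassiveAggressiveClassifier
from sklearn.metrics import accuracy_score, confusion_matrix


def clean_text(text):
    """Hypothetical cleaner: lowercase, drop URLs and non-letters, collapse spaces."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # remove URLs
    text = re.sub(r"[^a-z\s]", " ", text)       # keep letters and whitespace only
    return re.sub(r"\s+", " ", text).strip()    # collapse repeated whitespace


# Toy data invented for illustration; the real project reads the Kaggle dataset.
train_texts = [
    "Scientists CONFIRM the moon is made of cheese!!! http://example.com",
    "Aliens secretly control the world banks, insiders reveal",
    "The city council approved the annual budget on Tuesday.",
    "The central bank left interest rates unchanged this quarter.",
]
train_labels = ["FAKE", "FAKE", "REAL", "REAL"]
test_texts = [
    "Insiders reveal aliens built the moon",
    "The council will vote on the budget next quarter.",
]
test_labels = ["FAKE", "REAL"]

# Step 1 and 2: clean the text, then turn it into TF-IDF vectors.
# The vectorizer is fitted on the training split only.
vectorizer = TfidfVectorizer(stop_words="english")
X_train = vectorizer.fit_transform([clean_text(t) for t in train_texts])
X_test = vectorizer.transform([clean_text(t) for t in test_texts])

# Step 3: train a classifier on the vectors.
model = PassiveAggressiveClassifier(max_iter=1000, random_state=0)
model.fit(X_train, train_labels)
predictions = model.predict(X_test)

# Step 4: evaluate with an accuracy score and a confusion matrix.
accuracy = accuracy_score(test_labels, predictions)
matrix = confusion_matrix(test_labels, predictions, labels=["FAKE", "REAL"])
print("accuracy:", accuracy)
print(matrix)
```

With `labels=["FAKE", "REAL"]`, the rows of the confusion matrix are the true classes and the columns are the predicted classes, in that order. On a dataset this small the scores are meaningless; the sketch only shows how the pieces connect.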

-------------------------------------------------------------- End of file ------------------------------------------------------------------------
