# NLP Text Classification with Naive Bayes vs Logistic Regression

In this article, we examine the difference between using Logistic Regression and Naive Bayes to classify coffee drink reviews as positive or negative.

The dataset is a corpus of coffee drink reviews: consumers' comments about how each drink tastes. As a quick reminder, Logistic Regression and Naive Bayes are both machine learning techniques for binary classification, which we use here to determine whether a review is positive or negative.

**Logistic Regression**
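As a quick sketch of the model: logistic regression maps a weighted sum of the input features (here, word counts from a review) to a probability of a positive review through the sigmoid function:

```latex
P(y = 1 \mid \mathbf{x}) = \sigma(\mathbf{w}^\top \mathbf{x} + b)
  = \frac{1}{1 + e^{-(\mathbf{w}^\top \mathbf{x} + b)}}
```

A review is classified as positive when this probability exceeds 0.5.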

**Naive Bayes**
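Naive Bayes instead applies Bayes' theorem with the "naive" assumption that the words \(w_1, \dots, w_n\) of a review are conditionally independent given the class \(y\):

```latex
P(y \mid w_1, \dots, w_n) \propto P(y) \prod_{i=1}^{n} P(w_i \mid y)
```

The classifier picks whichever class (positive or negative) maximizes this product.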

**Step 1: Prepare the data**
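A minimal sketch of this step with pandas, assuming the corpus has a text column and a binary label; the file name and column names below are hypothetical stand-ins for the actual dataset:

```python
import pandas as pd

# In practice you would load the corpus from disk, e.g.:
#   reviews = pd.read_csv("coffee_reviews.csv")  # hypothetical file name
# Here we build a tiny stand-in frame with the same assumed columns.
reviews = pd.DataFrame({
    "text": [
        "Rich and smooth, loved this latte",
        "Bitter, burnt and watery. Awful.",
    ],
    "label": [1, 0],  # 1 = positive review, 0 = negative review
})

# Drop rows with missing text and any duplicate reviews
reviews = reviews.dropna(subset=["text"]).drop_duplicates()
print(reviews.shape)
```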

**Step 2: Data Processing**
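A typical cleaning pass for review text lowercases everything and strips punctuation so that "Latte!" and "latte" count as the same token. A minimal sketch (the exact cleaning rules are an assumption, not the article's verbatim pipeline):

```python
import re

def preprocess(text: str) -> str:
    """Lowercase a review and keep only letters and spaces."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # drop punctuation and digits
    return " ".join(text.split())          # collapse repeated whitespace

print(preprocess("Rich, SMOOTH latte!!"))  # -> rich smooth latte
```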

**Step 3: Split the data set into test and training set**
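Assuming a scikit-learn workflow, the split can be done with `train_test_split`; the toy reviews and the 75/25 ratio below are illustrative choices:

```python
from sklearn.model_selection import train_test_split

texts = ["great latte", "smooth taste", "rich aroma", "lovely crema",
         "too bitter", "burnt flavor", "watery drink", "stale beans"]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Hold out 25% of the reviews for testing; a fixed seed makes the split
# reproducible, and stratify keeps the positive/negative ratio in both sets
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=42, stratify=labels
)
print(len(X_train), len(X_test))  # 6 2
```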

**Step 4: Numerically encode the input data set**

**Step 5: Fit the model**
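Both classifiers are then fit on the same encoded training data. A small end-to-end sketch, again assuming scikit-learn and illustrative toy reviews:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB

train_texts = ["great smooth latte", "rich creamy taste",
               "bitter and burnt", "watery flat drink"]
train_labels = [1, 1, 0, 0]

vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)

# Fit both classifiers on the same bag-of-words features
log_reg = LogisticRegression().fit(X_train, train_labels)
naive_bayes = MultinomialNB().fit(X_train, train_labels)

# Reuse the fitted vectorizer (transform, not fit_transform) on new reviews
X_new = vectorizer.transform(["smooth rich latte"])
print(log_reg.predict(X_new), naive_bayes.predict(X_new))
```

Note that the vectorizer is fit only on the training set and reused for new data, so both models see a consistent vocabulary.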

**Step 6: Evaluate the model**
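The metrics reported below can all be computed with `sklearn.metrics`. A sketch on hypothetical labels (the numbers here are made up for illustration, not the article's results):

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix)

# Hypothetical true and predicted labels for a handful of test reviews
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1]

print("Accuracy: ", accuracy_score(y_true, y_pred))   # (TP + TN) / total
print("Precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
print("Recall:   ", recall_score(y_true, y_pred))     # TP / (TP + FN)
print("F1 Score: ", f1_score(y_true, y_pred))         # harmonic mean of the two
print(confusion_matrix(y_true, y_pred))               # rows: true class, cols: predicted
```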

**Error Metrics with Logistic Regression**

Accuracy: 0.866

Precision: 0.874

Recall: 0.944

F1 Score: 0.908

Confusion Matrix with Logistic Regression

**Error Metrics with Naive Bayes**

Accuracy: 0.866

Precision: 0.858

Recall: 0.968

F1 Score: 0.91

Confusion Matrix with Naive Bayes

We can see that both models reach the same accuracy (0.866), but Naive Bayes obtains slightly better recall (0.968 vs. 0.944) and F1 score (0.91 vs. 0.908), while Logistic Regression has the edge on precision (0.874 vs. 0.858).

To get the source code and the dataset, please see my GitHub repository.