
RESULTS


The Decision Tree model for sentiment analysis achieved moderate success, with an accuracy of 71.25%. Precision, recall, and F1 score fall in a similar range, indicating a balanced trade-off between how accurately the model flags positive sentiment and how much of the actual positive sentiment in the test data it covers.


Discussion of Results

  • Accuracy (71.25%) indicates that the model correctly predicts the sentiment of approximately 71 out of every 100 reviews. While promising, there's room for improvement, especially in domains where higher accuracy is critical.

  • Precision (65.45%) shows that when the model predicts a review as positive, it is correct about 65% of the time.

  • Recall (64.10%) reveals that the model successfully identifies 64% of all actual positive reviews.

  • F1 Score (64.77%) is the harmonic mean of precision and recall and reflects the balance between the two.
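As a quick sanity check, the reported F1 score can be reproduced from the reported precision and recall:

```python
# Verify that the reported F1 score is the harmonic mean of the
# reported precision and recall.
precision = 0.6545  # reported precision (65.45%)
recall = 0.6410     # reported recall (64.10%)

# Harmonic mean: F1 = 2 * P * R / (P + R)
f1 = 2 * precision * recall / (precision + recall)
print(f"F1 = {f1:.4f}")  # ~0.6477, matching the reported 64.77%
```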


Confusion Matrix


Analyzing the Confusion Matrix

  • True Negatives (TN): 1793 instances were correctly predicted as negative sentiment.

  • True Positives (TP): 1057 instances were correctly predicted as positive sentiment.

  • False Positives (FP): 558 instances were incorrectly predicted as positive sentiment.

  • False Negatives (FN): 592 instances were incorrectly predicted as negative sentiment.

This confusion matrix offers valuable insights:

  • The model is better at identifying negative sentiments than positive ones, as indicated by the higher number of true negatives.

  • There are considerable false negatives and false positives, suggesting areas where the model's performance could be improved.
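All four headline metrics can be recomputed directly from these confusion-matrix counts; a minimal sketch:

```python
# Recompute the reported metrics from the confusion-matrix counts above.
TN, TP, FP, FN = 1793, 1057, 558, 592

total = TN + TP + FP + FN                            # 4000 test reviews
accuracy = (TP + TN) / total                         # 0.7125
precision = TP / (TP + FP)                           # ~0.6545
recall = TP / (TP + FN)                              # ~0.6410
f1 = 2 * precision * recall / (precision + recall)   # ~0.6477

print(f"Accuracy:  {accuracy:.2%}")
print(f"Precision: {precision:.2%}")
print(f"Recall:    {recall:.2%}")
print(f"F1 score:  {f1:.2%}")
```

These values match the metrics reported earlier, confirming the confusion matrix and the summary statistics are consistent.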

Tree with depth 1


Tree with depth 3

Tree with depth 5

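As an illustrative sketch only (the TF-IDF featurization and variable names are assumptions, and the toy corpus below stands in for the real review dataset), trees of depths 1, 3, and 5 like those pictured can be fit with scikit-learn's DecisionTreeClassifier by varying max_depth:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Toy stand-in for the review corpus; the real pipeline would load
# the full review dataset and its sentiment labels here.
reviews = ["great product, loved it", "terrible, waste of money",
           "excellent quality", "awful experience", "really good value",
           "bad and disappointing", "fantastic purchase", "poor build"] * 50
labels = [1, 0, 1, 0, 1, 0, 1, 0] * 50

# Vectorize the text and hold out a test split.
X = TfidfVectorizer().fit_transform(reviews)
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=42)

# Fit one tree per depth shown in the figures above and compare accuracy.
scores = {}
for depth in (1, 3, 5):
    tree = DecisionTreeClassifier(max_depth=depth, random_state=42)
    tree.fit(X_train, y_train)
    scores[depth] = tree.score(X_test, y_test)
    print(f"max_depth={depth}: test accuracy {scores[depth]:.4f}")
```

Shallower trees ask fewer binary questions of the features, so comparing accuracy across depths shows how much expressive power the task demands before the tree starts to overfit.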

Using a Decision Tree for sentiment analysis offers a window into the layers of human communication. The model's moderate success at separating positive from negative reviews shows that while it can capture overt expressions of sentiment, the subtler tones woven into language often elude its sequence of binary splits. This underscores the importance of ample, diverse data and of carefully engineered features that capture the essence of human sentiment. Although the Decision Tree is a transparent and straightforward model, the complexity of natural language points toward more advanced methods, or the collective strength of an ensemble of models, to reach the accuracy and reliability that critical applications demand. The path forward, illuminated by this model's performance, is one of refinement: moving toward a more nuanced understanding of sentiment in text.

CONCLUSION
