DATA PREPARATION

1. TEXT SUMMARIZATION USING NEURAL NETWORKS

Dataset used for NN

[Image: dataset used for the neural network]

(b) Training and Testing Sets

  • Splitting Data: The labeled dataset is divided into two disjoint sets, a Training Set and a Testing Set. The Training Set is used to train the neural network model, while the Testing Set is held out to evaluate its performance and check that the model generalizes to new, unseen data.

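  • Why Disjoint: The Training and Testing sets must be disjoint so that the test score is an honest estimate of performance on unseen data. Overfitting occurs when a model learns the details and noise in the training data to an extent that hurts its performance on new data. If test examples also appeared in the training set, an overfit model would still score well on them and the problem would go undetected.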
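The split described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for this project: the `split_train_test` helper, the 80/20 ratio, the seed, and the placeholder (article, summary) pairs are all assumptions made for the example.

```python
import random


def split_train_test(records, test_fraction=0.2, seed=42):
    """Shuffle record indices and split the records into two disjoint lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    indices = list(range(len(records)))
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(len(records) * test_fraction))
    test_idx = set(indices[:n_test])
    # Each index lands in exactly one of the two lists.
    train = [r for i, r in enumerate(records) if i not in test_idx]
    test = [r for i, r in enumerate(records) if i in test_idx]
    return train, test


# Hypothetical (article, summary) pairs standing in for the real dataset.
records = [(f"article text {i}", f"summary {i}") for i in range(10)]
train_set, test_set = split_train_test(records, test_fraction=0.2)

# Disjointness check: no record may appear in both sets.
overlap = set(train_set) & set(test_set)
print(len(train_set), len(test_set), len(overlap))
```

Splitting by index guarantees that no row is assigned to both sets. If the raw data contains duplicate rows, it is worth removing them before splitting, since identical content could otherwise end up on both sides.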

Link to the data can be found here

Training Dataset

[Image: training dataset]

Testing Dataset

[Image: testing dataset]

CODE

Code for Text Summarization using Python can be found here.
