
DATA PREPARATION
1. TEXT SUMMARIZATION USING NEURAL NETWORKS

Dataset used for NN

(b) Training and Testing Sets
- Splitting Data: The labeled dataset must be divided into two disjoint sets: a Training Set and a Testing Set. The Training Set is used to build and train the neural network model, while the Testing Set is used to evaluate its performance and check that the model generalizes to new, unseen data.
- Why Disjoint: The Training and Testing Sets must be disjoint so that the Testing Set gives an honest measure of performance on unseen data. If an example appears in both sets, the model can score well simply by memorizing it, which hides overfitting. Overfitting occurs when a model learns the details and noise of the training data so closely that its performance on new data suffers.
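
The split described above can be sketched as follows. This is a minimal illustration, not the code used for this project: the function name `split_disjoint`, the 80/20 ratio, the fixed seed, and the toy (document, summary) pairs are all assumptions made for the example.

```python
import random


def split_disjoint(examples, test_fraction=0.2, seed=0):
    """Split examples into non-overlapping training and testing lists.

    Hypothetical helper: the 20% test fraction and the seed are
    illustrative defaults, not values taken from the project.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")

    # Shuffle the positions once, then cut the shuffled list in two.
    # Each position lands on exactly one side, so the sets cannot overlap.
    indices = list(range(len(examples)))
    random.Random(seed).shuffle(indices)

    n_test = max(1, round(len(examples) * test_fraction))
    test_idx = indices[:n_test]
    train_idx = indices[n_test:]

    train = [examples[i] for i in train_idx]
    test = [examples[i] for i in test_idx]
    return train, test


# Toy placeholder data standing in for the labeled dataset.
examples = [(f"document {i}", f"summary {i}") for i in range(10)]
train_set, test_set = split_disjoint(examples)

# No example appears in both sets.
assert set(train_set).isdisjoint(test_set)
print(len(train_set), len(test_set))
```

Splitting by position only guarantees disjointness if the dataset itself has no duplicate rows, so it may be worth deduplicating before the split.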
​
Link to the data can be found here

Training Dataset

Testing Dataset

CODE