Vibhanshu230/Text-Classification

Text-Classification

Converted the raw documents fetched from the 20newsgroups dataset into a vocabulary frequency table, excluding stop words.
Created a dictionary and implemented a Multinomial Naive Bayes function to classify the documents, which achieved an accuracy of 86%.
Printed the classification report for both the built-in and the self-implemented classifiers; the two reached approximately the same accuracy.
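
As a rough illustration of the idea, here is a minimal sketch of a Multinomial Naive Bayes classifier built on per-class word frequency counts. It is not the repository's actual code: the function names `train_multinomial_nb` and `predict`, the Laplace smoothing constant `alpha`, the choice to ignore unseen words at prediction time, and the toy documents are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model (hypothetical helper).

    docs   : list of token lists, stop words already removed
    labels : one class label per document
    alpha  : Laplace smoothing constant (assumed default of 1.0)
    """
    class_doc_counts = Counter(labels)
    word_counts = defaultdict(Counter)  # class -> word -> frequency
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)

    vocabulary = {w for counts in word_counts.values() for w in counts}
    model = {"vocabulary": vocabulary, "alpha": alpha, "classes": {}}
    for label, n_class_docs in class_doc_counts.items():
        total_words = sum(word_counts[label].values())
        model["classes"][label] = {
            "log_prior": math.log(n_class_docs / len(docs)),
            "counts": word_counts[label],
            # Smoothed denominator shared by every word of this class.
            "denominator": total_words + alpha * len(vocabulary),
        }
    return model


def predict(model, tokens):
    """Return the class with the highest log posterior for the tokens."""
    best_label, best_score = None, -math.inf
    for label, stats in model["classes"].items():
        score = stats["log_prior"]
        for word in tokens:
            if word not in model["vocabulary"]:
                continue  # assumption: words unseen in training are ignored
            smoothed = stats["counts"][word] + model["alpha"]
            score += math.log(smoothed / stats["denominator"])
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data, invented for illustration only.
train_docs = [
    ["engine", "car", "wheel"],
    ["car", "brake", "engine"],
    ["orbit", "rocket", "launch"],
    ["rocket", "moon", "orbit"],
]
train_labels = ["autos", "autos", "space", "space"]

model = train_multinomial_nb(train_docs, train_labels)
print(predict(model, ["rocket", "engine", "orbit"]))
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together, which matters for long newsgroup posts.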
The stop words are imported from nltk, supplemented with additional ones copied from the internet.
Instead of the split() function, I use a tokenizer, which makes the job much easier.
Instead of manually downloading the data from the internet, I fetch it with sklearn.datasets.fetch_20newsgroups.
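
To show why a tokenizer helps over split(), here is a small sketch that uses a regular expression as a stand-in tokenizer. The README does not say which tokenizer is used, so the pattern, the `tokenize` helper, and the tiny `STOP_WORDS` set below are assumptions for illustration; the real stop word list comes from nltk plus extra words.

```python
import re

# Hypothetical, tiny stop word set used only for this example.
STOP_WORDS = {"the", "is", "a", "an", "of", "and", "to", "in", "it"}

# Assumed pattern: runs of lowercase letters, so punctuation is dropped.
TOKEN_PATTERN = re.compile(r"[a-z]+")


def tokenize(text):
    """Lowercase the text, extract word tokens, and drop stop words."""
    return [t for t in TOKEN_PATTERN.findall(text.lower()) if t not in STOP_WORDS]


sample = "The engine is loud, and the brakes squeal!"

# split() keeps punctuation attached to words ("loud," and "squeal!").
print(sample.lower().split())

# The tokenizer yields clean words with stop words removed.
print(tokenize(sample))
```

With split(), "loud," and "loud" would be counted as different vocabulary entries, which inflates the frequency table with near-duplicates.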
