cspence001/crypto_sentiment_analysis

crypto_sentiment_analysis

Crypto Sentiment Analysis

Crypto Sentiment Analysis is a proof-of-concept application that measures social sentiment around the Dogecoin cryptocurrency and examines how that sentiment correlates with Dogecoin's market price. The repository contains a two-part project, sentiment analysis and machine learning classification, covering three months of heightened Dogecoin volatility in 2021.

Sentiment Analysis

Comment Extraction

  • Using the Reddit API, the project collects the thread IDs of the "Daily Discussion" threads in the subreddit r/doge for each day of the three-month period.
  • PMAW, a third-party wrapper for the Pushshift API, then batch-extracts the 554k comments posted in those "Daily Discussion" threads.
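
The two steps above can be sketched roughly as follows. This is an illustration, not the repository's code: `daily_discussion_ids` and `fetch_comments` are hypothetical helper names, the `id`, `title`, and `created_utc` keys are the standard Reddit submission fields, and the PMAW calls assume its `search_submission_comment_ids` and `search_comments` methods behave as documented.

```python
from datetime import date, datetime, timezone


def daily_discussion_ids(submissions, start, end):
    """Map each day in [start, end] to the ID of its "Daily Discussion" thread.

    `submissions` is an iterable of dicts with `id`, `title`, and
    `created_utc` keys (hypothetical input shape for this sketch).
    """
    ids_by_day = {}
    for sub in submissions:
        if "daily discussion" not in sub["title"].lower():
            continue
        day = datetime.fromtimestamp(sub["created_utc"], tz=timezone.utc).date()
        if start <= day <= end:
            # Keep the first matching thread seen for each day.
            ids_by_day.setdefault(day, sub["id"])
    return ids_by_day


def fetch_comments(thread_ids):
    """Batch-fetch all comments for the given thread IDs through PMAW.

    Needs network access and the third-party `pmaw` package, so the import
    is deferred until the function is actually called.
    """
    from pmaw import PushshiftAPI

    api = PushshiftAPI()
    comment_ids = list(api.search_submission_comment_ids(ids=list(thread_ids)))
    return list(api.search_comments(ids=comment_ids))
```

Keeping the thread selection separate from the network call makes the selection logic easy to check on a handful of sample submissions before running a large extraction.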

Sentiment Analysis using VADER

  • Each comment is scored with VADER, a sentiment analysis tool specifically attuned to social media content. VADER returns a polarity (compound) score on a scale from -1 to 1, along with positive, negative, and neutral scores, each a proportion between 0 and 1.
  • Based on the compound score, each comment is assigned an overall rating of positive, negative, or neutral.
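
A minimal sketch of the rating step is shown below. The ±0.05 cut-offs are the thresholds commonly suggested for VADER's compound score; the source does not say which thresholds this project uses, so treat them as an assumption. `label_from_compound` and `score_comment` are hypothetical names, and the sketch assumes the `vaderSentiment` package rather than NLTK's bundled copy.

```python
def label_from_compound(compound, threshold=0.05):
    """Turn a VADER compound score (-1 to 1) into an overall rating.

    The default threshold is an assumed, conventional value.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def score_comment(text, analyzer=None):
    """Score one comment with VADER and attach the overall rating.

    The import is deferred so the labelling logic above can be used
    without the third-party `vaderSentiment` package installed.
    """
    if analyzer is None:
        from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

        analyzer = SentimentIntensityAnalyzer()
    # polarity_scores returns a dict with "neg", "neu", "pos", "compound".
    scores = analyzer.polarity_scores(text)
    scores["rating"] = label_from_compound(scores["compound"])
    return scores
```

When scoring many comments, creating one analyzer and passing it in through `analyzer` avoids reloading the lexicon for every comment.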

Stock Value Correlation

  • Using the CoinGecko API, the project extracts Dogecoin's price at 5-minute intervals over the same three-month span.
  • Comments are grouped into 5-minute intervals by timestamp, and the mean of each score (compound, positive, negative, neutral) is calculated for every interval.
  • The interval means are then paired with the price at the same timestamps for plotting and correlation analysis.
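
The interval averaging and the correlation can be sketched with the standard library alone. This is an assumption-laden illustration: the comment dicts are taken to carry `created_utc` plus VADER's `compound`, `pos`, `neg`, and `neu` keys, prices are taken to be a mapping from interval start (Unix seconds) to price, and Pearson correlation is used because the source does not name a specific measure. All function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean

INTERVAL_SECONDS = 300  # 5 minutes
SCORE_KEYS = ("compound", "pos", "neg", "neu")


def interval_start(timestamp, width=INTERVAL_SECONDS):
    """Floor a Unix timestamp to the start of its interval."""
    ts = int(timestamp)
    return ts - ts % width


def mean_scores_by_interval(comments):
    """Average each VADER score over all comments in the same interval."""
    buckets = defaultdict(list)
    for comment in comments:
        buckets[interval_start(comment["created_utc"])].append(comment)
    return {
        start: {key: mean(row[key] for row in rows) for key in SCORE_KEYS}
        for start, rows in sorted(buckets.items())
    }


def pearson(xs, ys):
    """Pearson correlation coefficient; NaN if either series is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)


def sentiment_price_correlation(interval_means, prices, key="compound"):
    """Correlate one mean score with price over the intervals both share."""
    shared = sorted(set(interval_means) & set(prices))
    return pearson(
        [interval_means[t][key] for t in shared],
        [prices[t] for t in shared],
    )
```

Restricting the correlation to intervals present in both series sidesteps gaps, such as intervals with no comments, without inventing values for them.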

Machine Learning Classification

Evaluation of Classification Process

  • The VADER compound score of each comment determines its overall rating (positive, negative, or neutral).
  • These VADER-derived ratings serve as the labels for training Naive Bayes and Random Forest models, whose prediction accuracy is then measured against the VADER classification.
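
A rough sketch of such a comparison is given below, assuming scikit-learn and a simple bag-of-words representation. The source does not state which library, features, or train/test split the project uses, so the vectorizer, split ratio, and hyperparameters here are illustrative choices, and `compare_models` is a hypothetical name.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline


def compare_models(texts, ratings, test_size=0.25, seed=0):
    """Train both classifiers on VADER-derived ratings; return test accuracy."""
    x_train, x_test, y_train, y_test = train_test_split(
        texts, ratings, test_size=test_size, random_state=seed
    )
    models = {
        # Each pipeline turns raw comment text into word counts first.
        "naive_bayes": make_pipeline(CountVectorizer(), MultinomialNB()),
        "random_forest": make_pipeline(
            CountVectorizer(),
            RandomForestClassifier(n_estimators=100, random_state=seed),
        ),
    }
    results = {}
    for name, model in models.items():
        model.fit(x_train, y_train)
        results[name] = accuracy_score(y_test, model.predict(x_test))
    return results
```

Because the labels come from VADER rather than human annotation, the resulting accuracy indicates how well each model reproduces VADER's classification, not how well it captures sentiment independently.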

Performance Results

  • The project documents the process and the performance results of the machine learning classification.
  • Prediction accuracy is reported for both the Naive Bayes and Random Forest models, along with a classification report and a confusion matrix heatmap for the Random Forest model.
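
For readers unfamiliar with these outputs, the sketch below shows how a confusion matrix, overall accuracy, and per-class precision and recall are derived from true and predicted ratings. It is a plain-Python illustration with hypothetical function names, not the project's reporting code; rendering the matrix as a heatmap is left to a plotting library.

```python
from collections import Counter

LABELS = ("negative", "neutral", "positive")


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Rows are true ratings, columns are predicted ratings."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(matrix):
    """Share of predictions on the diagonal (true == predicted)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_report(matrix, labels=LABELS):
    """Precision and recall for each rating, as a classification report would list."""
    report = {}
    for i, label in enumerate(labels):
        predicted = sum(row[i] for row in matrix)  # column total
        actual = sum(matrix[i])                    # row total
        report[label] = {
            "precision": matrix[i][i] / predicted if predicted else 0.0,
            "recall": matrix[i][i] / actual if actual else 0.0,
        }
    return report
```

Off-diagonal cells show which ratings the model confuses with each other, which is the detail a single accuracy figure hides.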

Conclusion

Crypto Sentiment Analysis shows how social sentiment surrounding Dogecoin can be measured and compared against the coin's price fluctuations. The repository is intended as a resource for those interested in sentiment analysis and machine learning applications within the cryptocurrency domain.

Application Deployment

The Heroku deployment is currently down; run the application locally instead.


Local Application Deployment
git clone https://github.com/cspence001/crypto_sentiment_analysis.git
cd crypto_sentiment_analysis
python3 app.py
