This project performs sentiment analysis on Jane Austen's six novels using R's tidytext ecosystem. It explores emotional patterns across books, chapters, and storylines using multiple sentiment lexicons.

Novels analyzed:
- Sense and Sensibility
- Pride and Prejudice
- Mansfield Park
- Emma
- Northanger Abbey
- Persuasion
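
As a rough sketch of how these six novels might be loaded and put into tidy one-word-per-row form: `austen_books()` comes from janeaustenr, and `unnest_tokens()` and `stop_words` come from tidytext. The chapter-detection regex and the column names `linenumber` and `chapter` are assumptions for illustration, not necessarily what this project's scripts use.

```r
library(janeaustenr)
library(dplyr)
library(stringr)
library(tidytext)

# Start from one row per line of text and add per-book line and chapter counters.
# The chapter regex is an assumption: it matches lines such as "Chapter 1" or "CHAPTER IV".
tidy_books <- austen_books() %>%
  group_by(book) %>%
  mutate(
    linenumber = row_number(),
    chapter = cumsum(str_detect(text, regex("^chapter [\\divxlc]", ignore_case = TRUE)))
  ) %>%
  ungroup() %>%
  unnest_tokens(word, text) %>%       # one lowercased word per row
  anti_join(stop_words, by = "word")  # drop common stop words

# Number of remaining words per novel
count(tidy_books, book)
```

Keeping `linenumber` and `chapter` on every row makes it possible to aggregate sentiment later by chapter or by fixed-size sections of text.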
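
Sentiment lexicons:
- BING: binary positive/negative classification
- AFINN: integer score from -5 (most negative) to +5 (most positive)
- NRC: 10 categories (eight emotions plus positive and negative sentiment)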
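
A minimal sketch of how a lexicon is applied: tokens are joined to the lexicon on the `word` column, and words missing from the lexicon drop out. The toy token table below is hypothetical data for illustration. Only BING ships with tidytext; AFINN and NRC are downloaded through the textdata package on first use and ask you to accept their licenses, so those calls are left commented out.

```r
library(dplyr)
library(tidytext)

# Hypothetical toy tokens, standing in for the tokenized novels
toy <- tibble(word = c("happy", "love", "miserable", "the", "carriage"))

# BING lexicon: columns `word` and `sentiment` ("positive" or "negative")
bing <- get_sentiments("bing")

# Keep only words found in the lexicon, then count each label
toy_scored <- toy %>%
  inner_join(bing, by = "word") %>%
  count(sentiment)

toy_scored

# The other two lexicons need the textdata package and an interactive download:
# afinn <- get_sentiments("afinn")  # columns: word, value
# nrc   <- get_sentiments("nrc")    # columns: word, sentiment
```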

R packages:
- janeaustenr
- tidytext
- dplyr
- stringr
- ggplot2

Install the packages from CRAN (tidyverse includes dplyr, stringr, and ggplot2):

    install.packages("janeaustenr")
    install.packages("tidytext")
    install.packages("tidyverse")

See below for full folder structure.

Full Project Folder Structure

    Sentiment-Analysis-Austen-R/
    │
    ├── data/
    │   ├── raw_text/                # raw novel text if needed
    │   └── processed/               # cleaned tidy data (CSV exports)
    │
    ├── scripts/
    │   ├── 01_load_data.R           # load janeaustenr + tidy format
    │   ├── 02_preprocessing.R       # tokenization, stopwords removal
    │   ├── 03_bing_analysis.R       # positive/negative with BING
    │   ├── 04_afinn_analysis.R      # numeric scoring with AFINN
    │   ├── 05_nrc_analysis.R        # emotion categories with NRC
    │   └── 06_visualization.R       # all ggplot2 charts
    │
    ├── outputs/
    │   ├── plots/                   # saved PNG charts
    │   │   ├── bing_sentiment.png
    │   │   ├── afinn_scores.png
    │   │   └── nrc_emotions.png
    │   └── results/                 # CSV result tables
    │
    ├── reports/
    │   └── sentiment_report.Rmd     # R Markdown final report
    │
    ├── .gitignore
    ├── README.md
    └── sentiment_analysis.Rproj     # R project file
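
To give a feel for what the BING and visualization steps could produce, here is a hedged end-to-end sketch that computes net sentiment per section of each novel and saves a chart to `outputs/plots/bing_sentiment.png`. The 80-line section size, the `net` column, and the plot layout are assumptions for illustration; the actual scripts may differ.

```r
library(janeaustenr)
library(dplyr)
library(tidyr)
library(tidytext)
library(ggplot2)

# Net BING sentiment per 80-line section of each book (section size is an assumption).
# Recent dplyr versions may warn about many-to-many matches in the join,
# because a few lexicon words carry both labels.
bing_by_section <- austen_books() %>%
  group_by(book) %>%
  mutate(linenumber = row_number()) %>%
  ungroup() %>%
  unnest_tokens(word, text) %>%
  inner_join(get_sentiments("bing"), by = "word") %>%
  count(book, index = linenumber %/% 80, sentiment) %>%
  pivot_wider(names_from = sentiment, values_from = n, values_fill = 0) %>%
  mutate(net = positive - negative)

# One panel per novel, one bar per section
p <- ggplot(bing_by_section, aes(index, net, fill = book)) +
  geom_col(show.legend = FALSE) +
  facet_wrap(~book, ncol = 2, scales = "free_x") +
  labs(x = "Section (80 lines each)", y = "Net sentiment (positive - negative)")

# Create the output folder if it does not exist, then save the chart
dir.create("outputs/plots", recursive = TRUE, showWarnings = FALSE)
ggsave("outputs/plots/bing_sentiment.png", p, width = 8, height = 6)
```

Splitting each novel into fixed-size sections smooths out line-level noise while still showing how sentiment rises and falls over the course of a storyline.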