Alpha-Guard is a fully automated, cloud-hosted end-to-end Data Engineering pipeline. It extracts live stock market data and financial news feeds, processes statistical risk metrics and NLP sentiment scores, and loads the transformed data into a remote PostgreSQL database for live Business Intelligence reporting.
The architecture is entirely serverless, relying on GitHub Actions for CI/CD orchestration to execute daily micro-batch ETL jobs with zero manual intervention.
- Extract: Pulls historical tick data via `yfinance` and financial news headlines via Yahoo RSS XML feeds.
- Transform (Python):
  - Calculates 20-day rolling volatility and flags 3-sigma price anomalies.
  - Applies Natural Language Processing (NLP) using `vaderSentiment` to score headline sentiment (-1.0 to 1.0).
  - [v1.1 Update] Merges datasets to calculate the 20-Day Rolling Pearson Correlation between daily stock returns and news sentiment.
- Load (SQLAlchemy): Pushes cleaned, structured data into a Neon Serverless PostgreSQL database utilizing strict `NUMERIC(10, 4)` precision to prevent float rounding errors.
- Automate (CI/CD): Scheduled cron jobs via GitHub Actions run the pipeline every weekday at 10:00 PM UTC.
- Visualize: A scheduled Power BI Semantic Model fetches the fresh cloud data to update the dashboard automatically.
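
The arithmetic behind the Transform step can be illustrated with a small, dependency-free sketch. The pipeline itself uses `pandas`; the plain-Python version below is only meant to make the calculations explicit. The function names are hypothetical. It also assumes the 3-sigma rule is applied to daily returns against the mean and standard deviation of the preceding 20 returns, which the description above does not specify.

```python
import math
from statistics import mean, stdev

WINDOW = 20  # rolling window length stated in the pipeline description


def daily_returns(closes):
    """Simple day-over-day returns from a list of closing prices."""
    return [(b - a) / a for a, b in zip(closes, closes[1:])]


def rolling(values, window, fn):
    """Apply fn to each trailing window; None until the window is full."""
    return [
        fn(values[i - window + 1 : i + 1]) if i >= window - 1 else None
        for i in range(len(values))
    ]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (None if undefined)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else None


def flag_anomalies(returns, window=WINDOW, sigmas=3.0):
    """Flag returns lying more than `sigmas` standard deviations from the
    mean of the preceding `window` returns (assumed baseline)."""
    flags = []
    for i, r in enumerate(returns):
        if i < window:
            flags.append(False)  # not enough history yet
            continue
        base = returns[i - window : i]
        mu, sd = mean(base), stdev(base)
        flags.append(sd > 0 and abs(r - mu) > sigmas * sd)
    return flags


def rolling_correlation(returns, sentiment, window=WINDOW):
    """Rolling Pearson correlation between date-aligned returns and sentiment."""
    return [
        pearson(returns[i - window + 1 : i + 1], sentiment[i - window + 1 : i + 1])
        if i >= window - 1
        else None
        for i in range(len(returns))
    ]
```

Rolling volatility is then `rolling(returns, WINDOW, stdev)`. Excluding the current observation from its own baseline is a design choice in this sketch: it keeps a large move from inflating the standard deviation it is tested against.
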
- Languages: Python, SQL
- Libraries: `pandas` (Data manipulation & Statistical math), `SQLAlchemy`, `vaderSentiment`, `nltk`, `yfinance`
- Cloud & Infrastructure: GitHub Actions (Runner & Secrets Vault), Neon.tech (Cloud DB)
- Security: Decoupled architecture using runtime Environment Variables to secure database credentials.
- Analytics: Power BI (Scheduled Refresh)
- Resilient Automation: Replaced manual local execution with a containerized Ubuntu GitHub runner.
- Precision Data Engineering: Enforced strict `NUMERIC(10, 4)` SQL schemas over standard floats to prevent floating-point rounding errors in sensitive financial percentage data.
- Secure Credential Management: Database URIs are never hardcoded. Handled dynamically via `os.environ` and GitHub Secrets.
- Schema Handling: Implemented protocol standardizations (auto-converting `postgres://` to `postgresql://` for SQLAlchemy compatibility).
- Fault Tolerance: Built-in try/except blocks to prevent broken RSS XML nodes from failing the entire pipeline.
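
The credential handling, scheme conversion, and fault-tolerant RSS parsing described above might look roughly like the following sketch. It uses only the standard library. The environment variable name `DATABASE_URL` and all function names are assumptions made for illustration, not names taken from the repository.

```python
import os
import xml.etree.ElementTree as ET


def normalize_db_url(url):
    """Rewrite the legacy postgres:// scheme to postgresql:// for SQLAlchemy."""
    prefix = "postgres://"
    if url.startswith(prefix):
        return "postgresql://" + url[len(prefix):]
    return url


def database_url_from_env(var_name="DATABASE_URL"):
    """Read the connection string at runtime (var_name is hypothetical)."""
    url = os.environ.get(var_name)
    if not url:
        raise RuntimeError(f"{var_name} is not set")
    return normalize_db_url(url)


def safe_headlines(rss_xml):
    """Return headline titles from an RSS document, skipping broken items."""
    try:
        root = ET.fromstring(rss_xml)
    except ET.ParseError:
        return []  # an unparseable feed yields no headlines, not an exception
    titles = []
    for item in root.iter("item"):
        try:
            title = item.find("title").text.strip()
        except AttributeError:
            continue  # missing <title> node or empty text: skip this item only
        if title:
            titles.append(title)
    return titles
```

In a GitHub Actions run, the variable would be populated from a repository secret, so the connection string never appears in the source tree. Catching errors per item rather than around the whole loop means one malformed node costs a single headline instead of the entire batch.
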
- Clone the repository:

      git clone https://github.com/Soni-Test/Alpha-Guard.git
      cd Alpha-Guard