AWS EMR scale benchmarks (50GB+) #7

@AayushBarhate

What

Run all four approaches on AWS EMR (3x m5.xlarge) with 50GB+ of data. 3 traffic scenarios × 4 approaches × 5 runs = 60 benchmark runs.
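As a rough sketch of how the run matrix could be enumerated so every run gets a unique identifier: the scenario names come from the steps in this issue, while the approach labels (`approach_1` to `approach_4`) and the `run_id` format are placeholders, since the issue does not name the four approaches.

```python
from itertools import product

# Scenario names as listed in this issue.
SCENARIOS = ["steady", "periodic_burst", "random_burst"]
# Placeholder labels (assumption): the issue does not name the four approaches.
APPROACHES = ["approach_1", "approach_2", "approach_3", "approach_4"]
RUNS_PER_CELL = 5


def experiment_matrix():
    """Return one dict per benchmark run: scenario x approach x repetition."""
    runs = []
    for scenario, approach, rep in product(
        SCENARIOS, APPROACHES, range(1, RUNS_PER_CELL + 1)
    ):
        runs.append(
            {
                "scenario": scenario,
                "approach": approach,
                "rep": rep,
                # Hypothetical naming scheme for raw-data directories.
                "run_id": f"{scenario}-{approach}-r{rep}",
            }
        )
    return runs


matrix = experiment_matrix()
print(len(matrix))  # prints 60
```

Generating the matrix up front makes it easy to check afterwards that all 60 runs produced raw data.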

Why

Local testing on a 2-core VM does not show that the system works at production scale. The paper needs numbers measured at scale.

Steps

  1. Provision EMR cluster (3x m5.xlarge) + MSK Kafka
  2. Deploy agent JAR, predictor, generator
  3. Run experiment matrix: 3 scenarios (steady, periodic burst, random burst) × 4 approaches × 5 runs
  4. Collect latency CDFs, throughput curves, CPU/memory profiles
  5. Run Wilcoxon signed-rank tests for statistical significance
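For step 5, a minimal pure-Python sketch of an exact two-sided Wilcoxon signed-rank test is shown below. It assumes the samples are paired (for example, run i of one approach against run i of another under the same scenario), which the issue does not spell out. The function name and the input values are illustrative only and are not measured results. In practice `scipy.stats.wilcoxon` could be used instead.

```python
from itertools import product


def wilcoxon_signed_rank(x, y):
    """Exact two-sided Wilcoxon signed-rank test for paired samples.

    Returns (W_plus, p_value). Zero differences are dropped and tied
    absolute differences receive average ranks. All 2^n sign patterns are
    enumerated, so this is only suitable for small n.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank |d| from smallest to largest, averaging ranks over ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = sum(ranks) / 2
    observed = abs(w_plus - mean)

    # Count sign patterns at least as far from the mean as the observed one.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - mean) >= observed - 1e-12:
            extreme += 1
    return w_plus, extreme / 2 ** n


# Illustrative values only: five pairs where the first sample is always larger.
w_plus, p = wilcoxon_signed_rank([5.0, 6.0, 7.0, 8.0, 9.0], [4.0] * 5)
print(w_plus, p)  # prints 15.0 0.0625
```

The example also shows a limit worth planning for: with only 5 pairs, the smallest achievable two-sided exact p-value is 2/32 = 0.0625, so a two-sided test on 5 runs alone cannot reach p < 0.05. A one-sided test (minimum 1/32 ≈ 0.031) or pooling pairs across the three scenarios (15 pairs) would be needed.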

Acceptance criteria

  • 60 completed runs with raw data
  • Publication-quality latency CDF plots
  • Statistical significance confirmed (p < 0.05)
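For the latency CDF plots, the underlying data could be computed along these lines. This is a sketch using the standard empirical CDF and nearest-rank percentiles. The function names are hypothetical, and the plotting itself (for example a step plot in matplotlib) is left out.

```python
import math


def empirical_cdf(samples):
    """Return sorted (latency, cumulative_fraction) points for a step plot."""
    xs = sorted(samples)
    n = len(xs)
    if n == 0:
        raise ValueError("no samples")
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]


def percentile(samples, q):
    """Nearest-rank percentile for q in (0, 100]."""
    if not 0 < q <= 100:
        raise ValueError("q must be in (0, 100]")
    xs = sorted(samples)
    if not xs:
        raise ValueError("no samples")
    rank = max(1, math.ceil(q / 100 * len(xs)))
    return xs[rank - 1]


# Illustrative latencies in milliseconds, not measured data.
latencies_ms = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(percentile(latencies_ms, 50))  # prints 5
print(empirical_cdf([3, 1, 2])[0])   # first point: smallest sample, fraction 1/3
```

Reporting a few tail percentiles (p50, p99) from the same raw samples alongside each CDF keeps the plots and the summary tables consistent.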
