Viv3ckj/update report apr #206

Open
viv3ckj wants to merge 5 commits into main from viv3ckj/update-report-apr

@viv3ckj viv3ckj commented May 13, 2026

What are we trying to achieve?

We received a request via email asking whether the raw data behind the Pharmacy First monthly dashboard could also be made available for download alongside the report:
https://reports.opensafely.org/reports/opensafely-pharmacy-first-monthly-dashboard/

The aim is to make it easier for users to reuse and analyse the dashboard data themselves.

What is the problem?

Although we could publish the current outputs directly (pf_breakdown_measures.csv and pf_descriptive_stats_measures.csv), these tables are analysis-oriented and difficult for external users to work with.

Current issues include:

  • many mostly-empty (NA) columns
  • grouping information embedded in measure names (e.g. _by_sex, _by_imd)
  • internal variable names
  • multiple stratifications combined into sparse wide tables
  • columns that are not useful to dashboard users

For example:

measure                age_band  sex     imd
count_impetigo_by_sex  NA        Female  NA

This structure is difficult to filter/pivot and requires users to interpret metadata encoded inside the measure names.
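
To illustrate the parsing burden this places on users, here is a minimal Python sketch (not the R script in this PR) of what someone would need to write just to recover the grouping from a measure name. The helper name `split_measure_name` is hypothetical, and the `count_` prefix and `_by_<group>` suffix are assumed from the examples in this description rather than taken from the full set of measure names.

```python
import re

# Assumed naming convention: "count_<measure>" with an optional "_by_<group>"
# suffix, as in "count_impetigo_by_sex". Real measure names may differ.
MEASURE_PATTERN = re.compile(
    r"^count_(?P<measure>.+?)(?:_by_(?P<group_type>[a-z_]+))?$"
)


def split_measure_name(name):
    """Return (measure, group_type); group_type is None if there is no suffix."""
    match = MEASURE_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognised measure name: {name!r}")
    return match.group("measure"), match.group("group_type")


measure, group_type = split_measure_name("count_impetigo_by_sex")
print(measure, group_type)
```

Even after this step, the user still has to look up the matching column (here `sex`) to find the group value, which is the extra work the cleaned tables aim to remove.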

Proposed approach

This PR creates:

  • a new script: analysis/create_monthly_tables.R
  • a new action: generate_pf_opensafely_monthly_report_tables

These generate new user-facing downloadable tables from the existing analysis outputs.

The downloadable outputs would be:

  • pf_consultations_with_breakdowns.csv
  • pf_completeness.csv

These are new outputs and would sit alongside the existing outputs rather than replace them.

The cleaned tables:

  • use a tidier structure (measure_type, group_type, group)
  • remove unnecessary columns
  • simplify measure names
  • separate grouping metadata from measure names
  • make the outputs easier to filter/pivot in Excel or downstream analysis

Example proposed structure:

measure_type        measure   group_type  group   value
clinical_condition  impetigo  Sex         Female  123
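
The reshaping from the current wide row to the proposed tidy row could look roughly like the Python sketch below. It is an illustration under stated assumptions, not the logic in analysis/create_monthly_tables.R: the `value` input column, the `GROUP_COLUMNS` and `MEASURE_TYPES` lookups, and the `Overall`/`All` placeholder labels are all hypothetical.

```python
import csv
import io

# Assumed mapping from the suffix in a measure name to the user-facing label.
GROUP_COLUMNS = {"age_band": "Age band", "sex": "Sex", "imd": "IMD"}

# Assumed lookup from measure to measure_type; only this PR's example is listed.
MEASURE_TYPES = {"impetigo": "clinical_condition"}


def tidy_row(row):
    """Turn one wide row into the tidy shape proposed in this PR."""
    name = row["measure"]
    if name.startswith("count_"):
        name = name[len("count_"):]
    # "impetigo_by_sex" -> ("impetigo", "sex"); no suffix -> ("impetigo", "")
    measure, _, group_key = name.partition("_by_")
    grouped = group_key in GROUP_COLUMNS
    return {
        "measure_type": MEASURE_TYPES.get(measure, "unknown"),
        "measure": measure,
        # "Overall" and "All" are placeholder labels for ungrouped measures.
        "group_type": GROUP_COLUMNS[group_key] if grouped else "Overall",
        "group": row[group_key] if grouped else "All",
        "value": row["value"],
    }


# One wide row mirroring the earlier example; the "value" column is assumed.
wide_csv = (
    "measure,age_band,sex,imd,value\n"
    "count_impetigo_by_sex,NA,Female,NA,123\n"
)
tidy = [tidy_row(r) for r in csv.DictReader(io.StringIO(wide_csv))]
print(tidy[0])
```

With the grouping held in `group_type` and `group`, a user can filter on a single column instead of decoding measure names and scanning several mostly-empty columns.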

Alternatives considered

An alternative would be to publish the current analysis outputs directly.

However, this would make the downloads harder for external users to understand and work with, as the current tables are designed for report generation rather than public use.

Another option would be to create many separate CSVs for each stratification/figure, but this would increase maintenance burden and make the downloads harder to navigate.

The proposed approach keeps the number of downloadable files small while making their structure more user-friendly.

Closes #205

The downloadable breakdown and descriptive tables will now look like this:

[Two screenshots of the downloadable tables]

@viv3ckj viv3ckj force-pushed the viv3ckj/update-report-apr branch from fcd507c to 1ca8623 Compare May 14, 2026 12:25
@viv3ckj viv3ckj requested a review from milanwiedemann May 14, 2026 12:42

Development

Successfully merging this pull request may close these issues.

Publish CSVs to accompany figures on the Pharmacy First monthly dashboard
