Clinical and real-world evidence (RWE) analytics using electronic health record (EHR) data, reproducible workflows, and biomedical research experience.
This portfolio is organized for recruiters reviewing candidates for roles in:
- Healthcare Data Analytics
- Clinical Data Analytics
- Real-World Evidence (RWE)
- Epidemiology / Public Health Analytics
- Research Data Analytics
Strengths demonstrated across the projects include SQL-based cohort construction, EHR and MIMIC-IV workflows, logistic regression, survival analysis, absolute risk interpretation, SAS/Python validation, Quarto reporting, and reproducible analytics documentation.
| Project | Best For | Methods | Tools |
|---|---|---|---|
| COPD ICU RAAS Survival Analysis | RWE, clinical data, survival analysis | Kaplan-Meier, Cox proportional hazards, sensitivity analysis | BigQuery, SQL, Python, SAS, Quarto |
| Non-ICU RAAS Mortality Analysis | Clinical analytics, medication outcomes, absolute risk reporting | Logistic regression, marginal effects, sensitivity analysis | BigQuery, SQL, Python, SAS, Quarto |
| Public Health Statistics Workflow | Epidemiology, public health analytics | Descriptive statistics, age adjustment, regression, forest plots | Python, Quarto |
- COPD ICU RAAS Survival Analysis
- Non-ICU RAAS Mortality Analysis
- Public Health Statistics Workflow
Review the COPD project first for survival analysis and RWE-style clinical modeling, the non-ICU project second for logistic regression and absolute risk interpretation, and the public health workflow third for epidemiology and population-health statistics.

COPD ICU RAAS Survival Analysis:
- MIMIC-IV cohort construction
- BigQuery SQL
- Kaplan-Meier survival analysis
- Cox proportional hazards modeling
- Sensitivity analysis
- SAS/Python validation
- Quarto reporting
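
For readers unfamiliar with the Kaplan-Meier method listed above, the following is a minimal sketch of the product-limit estimator in plain Python. It is illustrative only and is not code from the project: the function name `kaplan_meier` and the toy durations and event flags are hypothetical, synthetic values, not MIMIC-IV data. At each distinct event time the survival estimate is multiplied by one minus the fraction of at-risk subjects who had the event, and censored subjects simply leave the risk set.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time for each subject
    events:    1 if the event (e.g. death) was observed, 0 if censored
    """
    deaths = Counter(t for t, e in zip(durations, events) if e == 1)
    exits = Counter(durations)  # every subject leaves the risk set at their own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the conditional survival at time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Subjects with an event or censored at t are no longer at risk afterwards.
        at_risk -= exits[t]
    return curve


# Synthetic toy data (hypothetical, for illustration only).
toy_durations = [2, 3, 3, 5, 8, 8, 10]
toy_events = [1, 1, 0, 1, 0, 1, 0]

km_curve = kaplan_meier(toy_durations, toy_events)
for t, s in km_curve:
    print(f"t={t}: S(t)={s:.3f}")
```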

Non-ICU RAAS Mortality Analysis:
- Non-ICU hospital admission cohort construction
- Medication exposure definition
- Multivariable logistic regression
- Adjusted predicted risk estimation
- Marginal effects analysis
- SAS/Python validation
- Quarto reporting
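
As a rough illustration of how adjusted predicted risks turn a logistic model into absolute-risk statements, the sketch below averages model-based predictions over a cohort with exposure set to 1 and then to 0 (sometimes called marginal standardization). Everything in it is an assumption made for illustration: the coefficients in `COEF`, the helper names, and the toy ages are hypothetical and are not estimates or data from the project, which may use a different approach.

```python
import math


def expit(x):
    """Inverse logit: map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical log-odds coefficients, NOT estimates from any project.
COEF = {"intercept": -2.0, "exposed": -0.5, "age_per_10y": 0.3}


def predicted_risk(age, exposed):
    """Predicted probability of the outcome for one subject."""
    linear = (
        COEF["intercept"]
        + COEF["exposed"] * exposed
        + COEF["age_per_10y"] * (age / 10.0)
    )
    return expit(linear)


def standardized_risks(ages):
    """Average predicted risk with everyone exposed vs. everyone unexposed."""
    n = len(ages)
    risk_exposed = sum(predicted_risk(a, 1) for a in ages) / n
    risk_unexposed = sum(predicted_risk(a, 0) for a in ages) / n
    return risk_exposed, risk_unexposed


# Synthetic toy cohort (hypothetical ages in years).
toy_ages = [55, 62, 70, 78, 85]
r1, r0 = standardized_risks(toy_ages)

print(f"Adjusted risk if all exposed:   {r1:.3f}")
print(f"Adjusted risk if all unexposed: {r0:.3f}")
print(f"Adjusted risk difference:       {r1 - r0:+.3f}")
```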

Public Health Statistics Workflow:
- Descriptive public-health statistics
- Age-group adjustment
- Linear regression
- Logistic regression
- Forest plot visualization
- Reproducible reporting
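
One common form of age adjustment is direct standardization, sketched below under stated assumptions: each age group's rate is weighted by that group's share of a standard population. The function name, age bands, counts, and weights are hypothetical toy values chosen for illustration, and the workflow itself may adjust for age differently.

```python
def directly_standardized_rate(events, person_years, standard_weights):
    """Weighted average of stratum-specific rates using standard-population weights."""
    total_weight = sum(standard_weights.values())
    rate = 0.0
    for group, weight in standard_weights.items():
        stratum_rate = events[group] / person_years[group]
        rate += stratum_rate * (weight / total_weight)
    return rate


# Synthetic toy inputs (hypothetical, for illustration only).
toy_events = {"18-44": 10, "45-64": 40, "65+": 90}
toy_person_years = {"18-44": 10000, "45-64": 8000, "65+": 3000}
toy_weights = {"18-44": 0.5, "45-64": 0.3, "65+": 0.2}

crude = sum(toy_events.values()) / sum(toy_person_years.values())
adjusted = directly_standardized_rate(toy_events, toy_person_years, toy_weights)

print(f"Crude rate per 1,000 person-years:        {crude * 1000:.2f}")
print(f"Age-adjusted rate per 1,000 person-years: {adjusted * 1000:.2f}")
```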
All clinical projects use de-identified MIMIC-IV data accessed in accordance with PhysioNet requirements. The repositories contain no patient-level source data, PHI, credentials, or restricted datasets. Reproducibility is documented in README files, Quarto reports, and, where applicable, REPRODUCIBILITY.md files.

Tools:
- SQL / BigQuery
- Python
- pandas / NumPy
- statsmodels / lifelines / scikit-learn
- SAS
- Quarto
- Git / GitHub
- GitHub Pages

Links:
- GitHub: https://github.com/makotoy56
- LinkedIn: https://www.linkedin.com/in/makoto-yoshida
- ORCID: https://orcid.org/0009-0002-5201-2743
