PICTographPlus

Overview

PICTographPlus is a computational tool that integrates bulk DNA and RNA sequencing data to:

  1. Reconstruct Clone-Specific Transcriptomic Profiles
  2. Infer Tumor Evolution
  3. Identify Transcriptional Transitions Between Clones

The tool infers tumor clonal evolution from single- or multi-region sequencing data by modeling uncertainty in the mutation cellular fraction (MCF) of small somatic mutations (SSMs) and copy number alterations (CNAs). Using a Bayesian hierarchical model, it assigns SSMs and CNAs to subclones and reconstructs tumor evolutionary trees that satisfy lineage precedence, the sum condition, and optional constraints based on sample presence.

For deconvolution, PICTographPlus combines the tumor clonal tree structure with clone proportions across samples to resolve bulk gene expression data. It optimizes an objective function that minimizes the discrepancy between observed and predicted sample-level gene expression while imposing a smoothness penalty, so that closely related clones have more similar gene expression.

Finally, the tool runs pathway enrichment analysis to identify pathways that are significantly altered along the tree edges connecting tumor clones.
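The two tree constraints named above can be illustrated with a small check. This is a minimal sketch, not PICTographPlus code: the function `check_tree_constraints`, the MCF matrix layout (rows are clones, columns are samples), and the edge-list format are all assumptions made for illustration. It uses the usual reading of the constraints: a child clone's MCF cannot exceed its parent's (lineage precedence), and the MCFs of a parent's children cannot sum to more than the parent's MCF (sum condition), in every sample.

```r
# Hypothetical helper (not part of the pictographPlus API).
# mcf:   numeric matrix of mutation cellular fractions, rows = clones, cols = samples
# edges: two-column matrix of (parent, child) clone indices
check_tree_constraints <- function(mcf, edges, tol = 1e-8) {
  # Lineage precedence: every child MCF <= its parent's MCF in every sample.
  lineage_ok <- all(
    mcf[edges[, 2], , drop = FALSE] <= mcf[edges[, 1], , drop = FALSE] + tol
  )

  # Sum condition: for each parent, the children's MCFs sum to at most
  # the parent's MCF in every sample.
  parents <- unique(edges[, 1])
  sum_ok <- all(vapply(parents, function(p) {
    kids <- edges[edges[, 1] == p, 2]
    all(colSums(mcf[kids, , drop = FALSE]) <= mcf[p, ] + tol)
  }, logical(1)))

  c(lineage_precedence = lineage_ok, sum_condition = sum_ok)
}

# Toy tree: clone 1 is the parent of clones 2 and 3; two samples.
edges <- rbind(c(1, 2), c(1, 3))

# Satisfies both constraints.
mcf_ok <- rbind(c(0.9, 0.8), c(0.5, 0.3), c(0.3, 0.4))

# Children sum to 1.1 > 0.9 in sample 1, so the sum condition fails.
mcf_bad <- rbind(c(0.9, 0.8), c(0.6, 0.3), c(0.5, 0.4))

print(check_tree_constraints(mcf_ok, edges))
print(check_tree_constraints(mcf_bad, edges))
```

In the toy data, `mcf_bad` still passes lineage precedence because each child is individually below its parent; only the sum over siblings exceeds the parent.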

Core Modules

  • runPictograph – Tumor evolution inference using genomic data
  • runDeconvolution – Bulk RNA expression deconvolution based on tumor evolution
  • runGSEA – Gene Set Enrichment Analysis (GSEA) for transcriptomic differences between clones

Key Features

  • Uses Bayesian hierarchical modeling to infer tumor clonal evolution.
  • Deconvolves bulk gene expression data using tumor clonal tree structures with 7 model variants.
  • Performs pathway enrichment analysis to highlight significant transcriptomic alterations.
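To make the tree-guided deconvolution idea concrete, the sketch below solves one simple form of such an objective: squared error between observed bulk expression `Y` and predicted expression `P %*% X`, plus `lambda` times the squared expression difference across each tree edge. The function `deconvolve_smooth`, the squared-difference penalty, and the closed-form solution are assumptions chosen for illustration; the seven model variants in the package may use different losses and penalties.

```r
# Hypothetical toy setup: 3 clones, 4 samples, 2 genes.
# Tree edges (parent, child): clone 1 -> clone 2, clone 1 -> clone 3.
edges <- rbind(c(1, 2), c(1, 3))
n_clones <- 3

# Clone proportions (rows = samples, columns = clones; each row sums to 1).
P <- rbind(
  c(1.0, 0.0, 0.0),
  c(0.5, 0.5, 0.0),
  c(0.5, 0.0, 0.5),
  c(0.2, 0.4, 0.4)
)

# Clone-level expression (rows = clones, columns = genes), used only to
# simulate noiseless bulk data for this example.
X_true <- rbind(c(10, 5), c(12, 5), c(10, 9))
Y <- P %*% X_true

# Graph Laplacian of the tree; sum over edges of ||x_u - x_v||^2
# equals trace(t(X) %*% L %*% X).
L <- matrix(0, n_clones, n_clones)
for (k in seq_len(nrow(edges))) {
  u <- edges[k, 1]
  v <- edges[k, 2]
  L[u, u] <- L[u, u] + 1
  L[v, v] <- L[v, v] + 1
  L[u, v] <- L[u, v] - 1
  L[v, u] <- L[v, u] - 1
}

# Minimizer of ||Y - P X||_F^2 + lambda * trace(t(X) L X):
# solve (P'P + lambda * L) X = P'Y.
deconvolve_smooth <- function(Y, P, L, lambda) {
  solve(crossprod(P) + lambda * L, crossprod(P, Y))
}

X_hat <- deconvolve_smooth(Y, P, L, lambda = 0.01)
print(round(X_hat, 3))
```

With `lambda = 0` this reduces to ordinary least squares and recovers `X_true` from the noiseless toy data; as `lambda` grows, the estimated profiles of clones joined by an edge are pulled toward each other.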

Model Selection Quick-Reference

| Scenario | Recommended model | Why |
|---|---|---|
| Default: matched normal sample available | elastic_net (λ=0.01) | Best synthetic F1 (0.347) and Sensitivity (0.368). |
| With-normal, if interpretability favoured | tree_delta (λ=0.05) | Nearly-tied F1 (0.339) with an explicit tree-structured prior; also strongest on with_extnorm. |
| With-normal, prioritise precision / low-FDR | adaptive (λ=0.50) | Highest MCC in with_normal (0.248). |
| Tumor-only (no normal reference) | adaptive_v2 (λ=0.50) | Best F1 (0.348), Sensitivity (0.360), MCC (0.256). |
| External (population-average) normal only | tree_delta (λ=0.05) | Best F1 (0.293) and Sensitivity (0.276). |
| Highest absolute expression recovery (Pearson r) | plain (λ=0.10) | Top star-topology Pearson r (0.942) among 7 models. |

Installation

Install JAGS (Required for Bayesian Analysis)

JAGS must be installed separately. Download it from: https://mcmc-jags.sourceforge.io
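Before installing the R package, it can help to confirm that JAGS is visible to R. The snippet below is a rough check that assumes the JAGS executable is named `jags` and is on the system PATH; some installations (notably on Windows) do not add it to the PATH, so a "not found" message here does not necessarily mean JAGS is missing.

```r
# Look for a JAGS executable on the PATH.
# Assumption: the binary is called "jags"; adjust if your install differs.
jags_path <- Sys.which("jags")

if (nzchar(jags_path)) {
  message("JAGS found at: ", jags_path)
} else {
  message("JAGS not found on PATH; install it or check your PATH settings.")
}
```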

Install PICTographPlus

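Run the following commands in R:

# Install devtools if it is not already available
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}

# Install PICTographPlus from GitHub, building the vignette
devtools::install_github("KarchinLab/pictographPlus", build_vignettes = TRUE)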

Package versions during development

PICTographPlus was developed under R 4.4.2. The versions of all packages installed during development are listed in installed_packages.csv.


Access Tutorial

A detailed tutorial is available as the package vignette:

library(pictographPlus)
vignette("pictographPlus", package = "pictographPlus")

Citation

If you use PICTographPlus in your research, please cite the archived software release:

Lai J, Yang Y, Karchin R (2026). pictographPlus: Reconstructing Clone-Resolved Transcriptional Programs from Bulk Tumor Sequencing. R package version 1.1.1. Zenodo. https://doi.org/10.5281/zenodo.19896732

You can also retrieve a bibentry from R with:

citation("pictographPlus")

Data

Processed data generated in the companion study (clone-level expression matrices, inferred clonal trees, and edge-level GSEA results) are available on Mendeley Data at https://doi.org/10.17632/cv66sgfcfn.2.

The benchmarks and applications in the manuscript use the following public datasets:

| Dataset | Accession |
|---|---|
| wellDR-seq scDNA/scRNA co-profile (benchmarking) | GEO: GSE261713 |
| IPMN WES + RNA-seq | dbGaP: phs002225.v3.p1 |
| TRACERx NSCLC WES | EGA: EGAD00001009825 |
| TRACERx NSCLC RNA-seq | EGA: EGAD00001009862 |
| PanCuRx PDAC WGS | EGA: EGAD00001004551 |
| PanCuRx PDAC RNA-seq | EGA: EGAD00001004548 |

All datasets were used in accordance with the respective data access agreements.
