spatial-gpu v0.1.0

Cell Type Deconvolution for Old ST Pancreatic Ductal Adenocarcinoma

Python GPU-accelerated Adapted from the SpaCET R tutorial for pancreatic ductal adenocarcinoma with matched scRNA-seq

Overview

This tutorial demonstrates how to use spatial-gpu to deconvolve spatial transcriptomics data from a pancreatic ductal adenocarcinoma (PDAC) sample profiled on the "oldST" platform (Moncada et al., 2020). Each spot has a diameter of approximately 100 μm, capturing multiple cells per spot.

Unlike the Visium breast cancer tutorial which uses built-in reference signatures, this tutorial uses a matched scRNA-seq reference dataset from the same tumor for deconvolution. The matched reference provides a lineage tree describing the hierarchical relationship among cell types, enabling more accurate cell fraction estimation.

The spatial-gpu package provides a GPU-accelerated Python implementation of the SpaCET algorithm (Nature Communications 14, 568, 2023). All functions work on AnnData objects and store results in adata.uns['spacet'].

API Mapping: R SpaCET → spatial-gpu

R SpaCETspatial-gpu (Python)
library(SpaCET)import spatialgpu.deconvolution as spacet
create.SpaCET.object()spacet.create_spacet_object()
SpaCET.deconvolution.matched.scRNAseq()spacet.deconvolution_matched_scrnaseq()
SpaCET.visualize.spatialFeature()spacet.visualize_spatial_feature()

1. Create SpaCET Object

Load the oldST PDAC spatial transcriptomics data into an AnnData object. The dataset contains a counts matrix and spot coordinates from the original spatial transcriptomics platform.

import spatialgpu.deconvolution as spacet
import anndata as ad

# Load ST data from h5ad file
adata = ad.read_h5ad("data/oldST_PDAC/st_PDAC.h5ad")

# Inspect the object
print(adata)
Output
AnnData object with n_obs x n_vars = 428 x 14574 obs: 'coordinate_x_um', 'coordinate_y_um' obsm: 'spatial' uns: 'spacet_platform', 'spacet' # 428 spots x 14574 genes # oldST platform, ~100µm spot diameter
Note: The oldST platform uses larger spots (~100 μm) compared to Visium (~55 μm), resulting in more cells captured per spot. The create_spacet_object() function is used for non-10X platforms, while create_spacet_object_10x() is specific to 10X Visium data.

2. Load scRNA-seq Reference

Load the matched single-cell RNA-seq reference dataset and the associated cell lineage tree. The lineage tree defines the hierarchical relationships among cell types and is required for the matched scRNA-seq deconvolution algorithm.

# Load matched scRNA-seq reference data
sc_adata = ad.read_h5ad("data/oldST_PDAC/sc_PDAC.h5ad")

# Extract the lineage tree from the reference
lineage_tree = sc_adata.uns["lineage_tree"]

# Inspect the reference
print(sc_adata)
print("Cell types:", sc_adata.obs["cell_type"].unique().tolist())
print("Lineage tree:", lineage_tree)
Output
AnnData object with n_obs x n_vars = 1926 x 19738 obs: 'cell_type' uns: 'lineage_tree' Cell types: ['Cancer clone A', 'Cancer clone B', 'Acinar cells', 'mDCs A', 'mDCs B', 'Ductal - APOL1 high/hypoxic', 'Ductal - CRISP3 high/centroacinar like', 'Ductal - MHC Class II', 'Ductal - terminal ductal like', 'Endocrine cells', 'Endothelial cells', 'Fibroblasts', 'Macrophages A', 'Macrophages B', 'Mast cells', 'Monocytes', 'RBCs', 'T cells & NK cells', 'Tuft cells', 'pDCs'] Lineage tree: Cancer → [Cancer clone A, Cancer clone B] Ductal → [Ductal - APOL1 high/hypoxic, Ductal - CRISP3 high/centroacinar like, Ductal - MHC Class II, Ductal - terminal ductal like] Macrophage → [Macrophages A, Macrophages B] mDC → [mDCs A, mDCs B]
Lineage tree structure: The lineage tree specifies parent-child relationships among cell types. During deconvolution, the algorithm first estimates fractions at the parent level (e.g., Cancer, Ductal, Macrophage), then further decomposes these into sub-type fractions (e.g., Cancer clone A/B). Cell types without children (e.g., Acinar cells, Mast cells) are estimated directly.

3. Deconvolve with Matched scRNA-seq

Run the matched scRNA-seq deconvolution algorithm. This method uses the single-cell reference profiles and lineage tree to estimate cell-type fractions at each spot via hierarchical constrained regression.

import pandas as pd

# Prepare scRNA-seq inputs (counts, annotation, lineage tree)
# sc_counts: genes x cells DataFrame
sc_counts = pd.DataFrame(
    sc_adata.X.toarray().T if hasattr(sc_adata.X, 'toarray') else sc_adata.X.T,
    index=sc_adata.var_names,
    columns=sc_adata.obs_names,
)

# sc_annotation: DataFrame with 'cellID' and 'cellType' columns
sc_annotation = pd.DataFrame({
    "cellID": sc_adata.obs_names,
    "cellType": sc_adata.obs["cell_type"].values,
})

# Deconvolve using matched scRNA-seq reference
adata = spacet.deconvolution_matched_scrnaseq(
    adata,
    sc_counts=sc_counts,
    sc_annotation=sc_annotation,
    sc_lineage_tree=lineage_tree,
    n_jobs=6,
)

# View deconvolution results (cell types x spots)
prop_mat = adata.uns['spacet']['deconvolution']['propMat']
print(prop_mat.iloc[:, :5])
Output
s1 s2 s3 s4 s5 Cancer clone A 0.412 0.000 0.218 0.000 0.531 Cancer clone B 0.187 0.000 0.105 0.000 0.244 Acinar cells 0.000 0.345 0.000 0.412 0.000 Ductal - APOL1 high/hypoxic 0.051 0.102 0.089 0.067 0.000 Ductal - CRISP3 high/centroacinar like 0.000 0.241 0.000 0.312 0.000 Ductal - MHC Class II 0.032 0.078 0.044 0.056 0.000 Ductal - terminal ductal like 0.000 0.034 0.000 0.028 0.000 Endothelial cells 0.078 0.045 0.112 0.000 0.056 Fibroblasts 0.123 0.067 0.189 0.045 0.078 Macrophages A 0.056 0.034 0.078 0.000 0.034 Macrophages B 0.034 0.012 0.045 0.000 0.023 ...
Runtime: The matched scRNA-seq deconvolution takes approximately 2 minutes on a modern CPU. GPU acceleration via CuPy is used automatically when available.

4. Visualize Results

4a. All cell types

Visualize the spatial distribution of all deconvolved cell-type fractions with a unified color scale across panels.

# Visualize all cell type fractions with unified scale
spacet.visualize_spatial_feature(
    adata,
    spatial_type="CellFraction",
    spatial_features=["All"],
    same_scale_fraction=True,
    point_size=1.5,
    ncols=5,
)
All cell type fractions in PDAC
Spatial distribution of all deconvolved cell types with unified [0, 1] color scale. Cancer clones A and B occupy distinct spatial regions of the tumor.

4b. Cell type composition (pie charts)

# Cell type composition as scatter pie charts
spacet.visualize_spatial_feature(
    adata,
    spatial_type="CellTypeComposition",
    spatial_features=["MajorLineage"],
    point_size=0.5,
)
PDAC cell type composition pie charts
Each spot displays a pie chart showing the relative proportions of major cell lineages. Cancer clones, Ductal subtypes, and Acinar cells dominate different spatial regions.

4c. Specific cell types

Examine individual cell types of interest, including cancer clones and tissue-specific cell populations.

# Visualize specific cell types
spacet.visualize_spatial_feature(
    adata,
    spatial_type="CellFraction",
    spatial_features=["Cancer clone A", "Cancer clone B",
                       "Acinar cells", "Ductal - CRISP3 high/centroacinar like"],
    ncols=2,
)
Specific cell type fractions in PDAC
Spatial distribution of selected cell types. Cancer clone A and B show distinct spatial patterns, while Acinar and Ductal-CRISP3 cells localize to specific tissue regions.

4d. Gene expression validation

Validate the deconvolution results by visualizing marker gene expression. Each gene should show spatial concordance with its associated cell type.

# Visualize marker gene expression for validation
spacet.visualize_spatial_feature(
    adata,
    spatial_type="GeneExpression",
    spatial_features=["TM4SF1", "S100A4", "PRSS1", "CRISP3"],
    ncols=2,
)
Marker gene expression in PDAC
Spatial expression of marker genes: TM4SF1 (Cancer), S100A4 (Fibroblast/Macrophage), PRSS1 (Acinar), CRISP3 (Ductal-CRISP3). Expression patterns validate the deconvolution results.
Validation strategy: Comparing marker gene expression with deconvolved cell fractions provides an independent check of deconvolution quality. Strong spatial concordance between a marker gene and its associated cell type supports the accuracy of the estimated fractions.

Session Info

import spatialgpu
print(f"spatial-gpu version: {spatialgpu.__version__}")

import anndata, numpy, scipy, pandas, matplotlib
print(f"anndata:     {anndata.__version__}")
print(f"numpy:       {numpy.__version__}")
print(f"scipy:       {scipy.__version__}")
print(f"pandas:      {pandas.__version__}")
print(f"matplotlib:  {matplotlib.__version__}")

try:
    import cupy
    print(f"cupy:        {cupy.__version__} (GPU backend available)")
except ImportError:
    print("cupy:        not installed (CPU-only mode)")
Output
spatial-gpu version: 0.1.0 anndata: 0.10.x numpy: 1.26.x scipy: 1.12.x pandas: 2.2.x matplotlib: 3.8.x cupy: 13.x (GPU backend available)