spatial-gpu v0.1.0

Cell Type Deconvolution for High-Resolution Slide-seq Colorectal Cancer

Python · GPU-accelerated · Adapted from the SpaCET R tutorial for colorectal cancer Slide-seq

Overview

This tutorial demonstrates how to use spatial-gpu to deconvolve high-resolution spatial transcriptomics data from a colorectal cancer (CRC) sample profiled using the Slide-seq platform (Zhao et al., 2021). Each 10 μm bead captures approximately 1–2 cells, providing near single-cell resolution.

Because each Slide-seq bead contains only 1–2 cells, the deconvolution problem differs from that of lower-resolution platforms such as Visium (~55 μm spots) or oldST (~100 μm spots), where a single spot mixes several cells. The SpaCET algorithm handles this by adapting its cell fraction estimation to the platform resolution.

The spatial-gpu package provides a GPU-accelerated Python implementation of the SpaCET algorithm (Nature Communications 14, 568, 2023). All functions work on AnnData objects and store results in adata.uns['spacet'].

API Mapping: R SpaCET → spatial-gpu

R SpaCET                            spatial-gpu (Python)
library(SpaCET)                     import spatialgpu.deconvolution as spacet
create.SpaCET.object()              spacet.create_spacet_object()
SpaCET.deconvolution()              spacet.deconvolution()
SpaCET.visualize.spatialFeature()   spacet.visualize_spatial_feature()

1. Create SpaCET Object

Load the CRC Slide-seq data into an AnnData object. The dataset contains 10 μm beads with spatial coordinates and gene expression counts.

import spatialgpu.deconvolution as spacet
import anndata as ad

# Load Slide-seq CRC data from h5ad file
adata = ad.read_h5ad("data/hiresST_CRC/hiresST_CRC.h5ad")

# Inspect the object
print(adata)
Output
AnnData object with n_obs x n_vars = 18288 x 16270
    obs: 'coordinate_x_um', 'coordinate_y_um'
    obsm: 'spatial'
    uns: 'spacet_platform', 'spacet'
# 18288 beads x 16270 genes
# SlideSeq platform, ~10µm bead diameter
High-resolution platform: Slide-seq beads are ~10 μm in diameter, capturing only 1–2 cells each. This results in sparser gene expression per bead compared to Visium, but provides much higher spatial resolution. Here we load from a pre-processed h5ad file. To create from raw counts and coordinates, use spacet.create_spacet_object(counts, spot_coordinates, platform="SlideSeq").
Dataset size: With 18,288 beads, this dataset is nearly five times larger than the Visium example (3,813 spots). Deconvolution runtime scales with the number of spatial units, so GPU acceleration is especially beneficial for datasets of this size.
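
The inputs for the raw-counts route mentioned in the note above can be prepared with pandas. The sketch below is a minimal illustration, not the package's documented interface: it assumes that counts is a genes × beads table and that spot_coordinates is a beads × 2 table sharing the same bead IDs, mirroring the convention of the R package. The gene names, bead IDs, and values are made up, and the final call is left commented out because its exact input requirements should be checked against the spatial-gpu documentation.

```python
import numpy as np
import pandas as pd

# Toy data: 4 genes x 3 beads (hypothetical names and values).
rng = np.random.default_rng(0)
genes = ["EPCAM", "COL1A1", "PECAM1", "PTPRC"]
beads = ["b1", "b2", "b3"]

# Assumed layout: rows are genes, columns are beads, raw integer counts.
counts = pd.DataFrame(
    rng.poisson(lam=2.0, size=(len(genes), len(beads))),
    index=genes,
    columns=beads,
)

# Assumed layout: one row per bead, X/Y positions in micrometers.
spot_coordinates = pd.DataFrame(
    {"X": [0.0, 10.0, 20.0], "Y": [0.0, 0.0, 10.0]},
    index=beads,
)

# Sanity check: every bead with counts has a coordinate, in the same order.
beads_aligned = list(counts.columns) == list(spot_coordinates.index)
print("beads aligned:", beads_aligned)

# With aligned inputs, the call quoted in the note above would be:
# import spatialgpu.deconvolution as spacet
# adata = spacet.create_spacet_object(counts, spot_coordinates, platform="SlideSeq")
```

Checking the alignment before building the object helps avoid silently pairing expression profiles with the wrong coordinates.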

2. Deconvolve ST Data

Run the two-stage SpaCET deconvolution algorithm for colorectal cancer. Stage 1 estimates the malignant cell fraction using CRC-specific copy number alteration and expression signatures. Stage 2 decomposes the non-malignant fraction into immune and stromal cell types using hierarchical constrained regression.
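
As intuition for the second stage, the toy example below decomposes the non-malignant part of a single bead with non-negative least squares. It is a simplified sketch, not the SpaCET implementation: the three-type reference matrix, the marker values, the assumed malignant fraction, and the use of scipy.optimize.nnls are all assumptions chosen for illustration, and the hierarchical lineage structure is omitted.

```python
import numpy as np
from scipy.optimize import nnls

# Hypothetical reference signatures: 4 genes (rows) x 3 cell types (columns).
cell_types = ["CAF", "Endothelial", "Macrophage"]
reference = np.array([
    [9.0, 1.0, 0.5],   # fibroblast-like marker
    [1.0, 8.0, 0.5],   # endothelial-like marker
    [0.5, 0.5, 7.0],   # macrophage-like marker
    [2.0, 2.0, 2.0],   # housekeeping-like gene
])

# Assume stage 1 estimated a malignant fraction of 0.4 for this bead,
# leaving 0.6 to distribute among the non-malignant cell types.
malignant_fraction = 0.4
true_fractions = np.array([0.3, 0.1, 0.2])  # sums to 0.6

# Simulated non-malignant expression profile (noise-free for clarity).
bead_profile = reference @ true_fractions

# Non-negative least squares: minimize ||reference @ f - bead_profile|| with f >= 0.
estimated, residual = nnls(reference, bead_profile)

# Cap the total so that all fractions, malignant included, sum to at most 1.
budget = 1.0 - malignant_fraction
if estimated.sum() > budget:
    estimated *= budget / estimated.sum()

for name, value in zip(cell_types, estimated):
    print(f"{name}: {value:.3f}")
```

With noise-free input the estimate matches the simulated fractions; on real beads, sparsity and noise make the constraints matter far more.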

# Run two-stage deconvolution for colorectal cancer (CRC)
adata = spacet.deconvolution(adata, cancer_type="CRC", n_jobs=6)

# View deconvolution results (cell types x beads)
prop_mat = adata.uns['spacet']['deconvolution']['propMat']
print(f"Cell types: {prop_mat.shape[0]}")
print(f"Beads: {prop_mat.shape[1]}")
print(prop_mat.iloc[:, :6])
Output
Cell types: 13
Beads: 18288
                b1     b2     b3     b4     b5     b6
Malignant    0.823  0.000  0.912  0.000  0.756  0.000
CAF          0.000  0.312  0.000  0.087  0.000  0.245
Endothelial  0.000  0.234  0.000  0.156  0.000  0.189
Plasma       0.000  0.000  0.000  0.000  0.000  0.000
B cell       0.000  0.098  0.000  0.067  0.000  0.045
T CD4        0.000  0.045  0.000  0.123  0.000  0.078
T CD8        0.000  0.034  0.000  0.089  0.000  0.056
NK           0.000  0.000  0.000  0.034  0.000  0.000
cDC          0.000  0.000  0.000  0.000  0.000  0.012
pDC          0.000  0.000  0.000  0.000  0.000  0.000
Macrophage   0.000  0.156  0.000  0.234  0.000  0.178
Mast         0.000  0.000  0.000  0.023  0.000  0.000
Neutrophil   0.000  0.000  0.000  0.000  0.000  0.000
Runtime: Deconvolution of 18,288 beads takes approximately 30 minutes on a modern CPU. Using a GPU with CuPy can reduce this to under 5 minutes. Use n_jobs to control the number of parallel CPU workers.
High-resolution interpretation: Because each Slide-seq bead captures only 1–2 cells, the deconvolution results are closer to discrete cell-type assignments than continuous mixtures. Most beads will show a dominant cell type with fraction close to 1.0.
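
To check how close a result is to a discrete assignment, one can look at the largest fraction per bead. The sketch below uses a small pandas table with the same orientation as propMat (cell types × beads). The values are copied from the first four beads of the example output above for a subset of cell types, and the 0.7 cutoff is an arbitrary illustrative threshold, not a package default.

```python
import pandas as pd

# Subset of the example output above: rows are cell types, columns are beads.
prop_mat = pd.DataFrame(
    {
        "b1": [0.823, 0.000, 0.000, 0.000],
        "b2": [0.000, 0.312, 0.234, 0.156],
        "b3": [0.912, 0.000, 0.000, 0.000],
        "b4": [0.000, 0.087, 0.156, 0.234],
    },
    index=["Malignant", "CAF", "Endothelial", "Macrophage"],
)

# Dominant cell type and its fraction for each bead (column-wise maximum).
dominant_type = prop_mat.idxmax(axis=0)
dominant_fraction = prop_mat.max(axis=0)

# Flag beads whose top fraction exceeds the illustrative cutoff of 0.7.
near_discrete = dominant_fraction > 0.7

summary = pd.DataFrame({
    "dominant_type": dominant_type,
    "fraction": dominant_fraction,
    "near_discrete": near_discrete,
})
print(summary)
```

The same lines should work on the full table retrieved from adata.uns['spacet']['deconvolution']['propMat'], provided it is a pandas DataFrame as in the example above.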

3. Visualize Results

3a. Cell type fractions

Visualize selected cell-type fractions across the tissue. Use a smaller point_size to accommodate the high density of Slide-seq beads.

# Visualize selected cell type fractions
spacet.visualize_spatial_feature(
    adata,
    spatial_type="CellFraction",
    spatial_features=["Malignant", "CAF", "Endothelial"],
    point_size=0.6,
    same_scale_fraction=True,
)
Figure: CRC cell type fractions. Spatial distribution of Malignant, CAF, and Endothelial cell fractions in the CRC Slide-seq dataset. Color scale: blue (low) → yellow → red (high). Malignant cells cluster in the tumor core, while CAF and Endothelial cells are enriched in the stroma.

3b. Most abundant cell type

Assign each bead to its dominant cell type and display with custom colors to highlight the spatial organization of the tumor microenvironment.

import json

# Load color mapping (matching R tutorial's colors_vector.rda)
with open("data/hiresST_CRC/colors_vector.json") as f:
    colors_vector = json.load(f)

# Visualize the most abundant cell type per bead
spacet.visualize_spatial_feature(
    adata,
    spatial_type="MostAbundantCellType",
    spatial_features=["MajorLineage"],
    colors=colors_vector,
    point_size=0.6,
)
Figure: CRC most abundant cell type. Each bead is colored by its dominant cell type. The high resolution of Slide-seq reveals fine-grained spatial organization of the tumor microenvironment, with clear boundaries between malignant and stromal regions.
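
The color file is loaded as plain JSON, but its exact schema is not shown in this tutorial. The sketch below assumes a flat mapping from cell-type name to hex color, with hypothetical color values, and checks that every cell type to be plotted has an entry before the mapping is passed to the plotting call.

```python
import json

# Hypothetical file content: a flat {cell type: hex color} mapping (assumed schema).
colors_json = """
{
  "Malignant": "#f3c300",
  "CAF": "#875692",
  "Endothelial": "#f38400",
  "Macrophage": "#a1caf1"
}
"""
colors_vector = json.loads(colors_json)

# Cell types expected in the plot (a subset of the deconvolution output).
cell_types = ["Malignant", "CAF", "Endothelial", "Macrophage", "Plasma"]

# Report any cell type that would be left without a color.
missing = [name for name in cell_types if name not in colors_vector]
print("missing colors:", missing)
```

A check like this surfaces naming mismatches (for example "T CD4" versus "CD4 T") before they show up as unexpected colors in the figure.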
Downstream analysis: After deconvolution, you can proceed with cell-cell interaction analysis (spacet.cci_colocalization()), interface identification (spacet.identify_interface()), and gene set scoring (spacet.gene_set_score()) using the same API demonstrated in the Visium Breast Cancer tutorial and Gene Set Score tutorial.

Session Info

import spatialgpu
print(f"spatial-gpu version: {spatialgpu.__version__}")

import anndata, numpy, scipy, pandas, matplotlib
print(f"anndata:     {anndata.__version__}")
print(f"numpy:       {numpy.__version__}")
print(f"scipy:       {scipy.__version__}")
print(f"pandas:      {pandas.__version__}")
print(f"matplotlib:  {matplotlib.__version__}")

try:
    import cupy
    print(f"cupy:        {cupy.__version__} (GPU backend available)")
except ImportError:
    print("cupy:        not installed (CPU-only mode)")
Output
spatial-gpu version: 0.1.0
anndata:     0.10.x
numpy:       1.26.x
scipy:       1.12.x
pandas:      2.2.x
matplotlib:  3.8.x
cupy:        13.x (GPU backend available)