cell2home - create reference and compute homing scores

[1]:

import sys
import scanpy as sc
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns
from matplotlib import rcParams

import scipy.sparse
import mudata
import muon as mu
import session_info

import cell2home as c2h

%config InlineBackend.figure_format = 'retina'

sc.set_figure_params(dpi=80)
rcParams["axes.grid"] = False

This demo notebook uses two publicly available objects. Download both into the same directory as the notebook before proceeding:

https://cellgeni.cog.sanger.ac.uk/pan-immune/CountAdded_PIP_T_object_for_cellxgene.h5ad

https://cellgeni.cog.sanger.ac.uk/pan-immune/myeloid.h5ad

[2]:

# Create a list to hold the pairs
pairs = []
# Iterate through the dictionary to create pairs
for key, values in c2h.markers.items():
    for value in values:
        pairs.append((key, value))

# Create a DataFrame from the pairs
interactions = pd.DataFrame(pairs, columns=['target', 'source'])
interactions

[2]:

	target	source
0	CCR1	CCL3
1	CCR1	CCL5
2	CCR1	CCL7
3	CCR1	CCL8
4	CCR1	CCL14
...	...	...
89	ITGB7	FN1
90	ITGB7	MADCAM1
91	ITGB7	VCAM1
92	ITGB7	CDH1
93	ITGAE	CDH1

94 rows × 2 columns
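The flattening step in cell [2] can also be written as a single comprehension. The following is a minimal sketch, assuming `c2h.markers` is a dict that maps each receptor (target) to a list of its ligands (sources), as the table above suggests. `toy_markers` is a hypothetical stand-in holding a few pairs copied from that table, so the snippet runs without cell2home installed.

```python
import pandas as pd

# Hypothetical stand-in for c2h.markers: receptor -> list of ligands.
toy_markers = {
    "CCR1": ["CCL3", "CCL5"],
    "ITGB7": ["MADCAM1", "CDH1"],
    "ITGAE": ["CDH1"],
}

# One row per (receptor, ligand) pair, with the same column names as in cell [2].
toy_interactions = pd.DataFrame(
    [(receptor, ligand) for receptor, ligands in toy_markers.items() for ligand in ligands],
    columns=["target", "source"],
)

# A ligand can pair with several receptors, so count receptors per ligand.
receptors_per_ligand = toy_interactions.groupby("source")["target"].nunique()

print(toy_interactions)
print(receptors_per_ligand)
```

Counting receptors per ligand is a quick way to see which ligands (here CDH1) will appear more than once in any table derived from the pairs.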

[3]:

# Load reference data (these are the cells expressing ligands that attract the migrating cells)
# here T cells from cross-tissue immune atlas (Dominguez-Conde et al. Science 2022)
adata = sc.read_h5ad('CountAdded_PIP_T_object_for_cellxgene.h5ad')
adata.X = adata.layers["counts"]
del adata.layers["counts"]

[4]:

adata

[4]:

AnnData object with n_obs × n_vars = 216611 × 36601
    obs: 'Organ', 'Donor', 'Chemistry', 'Predicted_labels_CellTypist', 'Majority_voting_CellTypist', 'Manually_curated_celltype', 'Sex', 'Age_range'
    uns: 'Age_range_colors', 'Sex_colors'
    obsm: 'X_umap'

[5]:

sc.pl.umap(adata, color='Manually_curated_celltype')

/home/jovyan/my-conda-envs/cell2home/lib/python3.8/site-packages/scanpy/plotting/_tools/scatterplots.py:394: UserWarning: No data for colormapping provided via 'c'. Parameters 'cmap' will be ignored
  cax = scatter(

../_images/notebooks_001_cell2home_demo_6_1.png

[6]:

# Construct signatures from selected cell types
signatures = c2h.construct_signatures(adata, 'Manually_curated_celltype', interactions)
signatures

[6]:

	population	source	expression	target
0	Cycling T&NK	CCL3	0.881343	CCR1
1	Cycling T&NK	CCL3	0.881343	CCR3
2	Cycling T&NK	CCL3	0.881343	CCR5
3	ILC3	CCL3	-0.758298	CCR1
4	ILC3	CCL3	-0.758298	CCR3
...	...	...	...	...
1687	Trm_Tgd	CDH1	2.585478	ITGAE
1688	Trm_Th1/Th17	CDH1	0.798664	ITGB7
1689	Trm_Th1/Th17	CDH1	0.798664	ITGAE
1690	Trm_gut_CD8	CDH1	0.491016	ITGB7
1691	Trm_gut_CD8	CDH1	0.491016	ITGAE

1692 rows × 4 columns
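In the table above, each ligand's expression value is repeated once for every receptor it pairs with. To inspect the reference as a population × ligand matrix, one option is to drop those repeats and pivot. The sketch below uses a handful of rows copied from the output; `toy_signatures` is a hypothetical stand-in for `signatures`, and the receptor assignments of the middle rows are illustrative.

```python
import pandas as pd

# A few rows in the same layout as the signatures table (stand-in data).
toy_signatures = pd.DataFrame(
    {
        "population": ["Cycling T&NK", "Cycling T&NK", "ILC3", "Trm_Tgd", "Trm_gut_CD8", "Trm_gut_CD8"],
        "source": ["CCL3", "CCL3", "CCL3", "CDH1", "CDH1", "CDH1"],
        "expression": [0.881343, 0.881343, -0.758298, 2.585478, 0.491016, 0.491016],
        "target": ["CCR1", "CCR3", "CCR1", "ITGAE", "ITGB7", "ITGAE"],
    }
)

# Keep one row per (population, ligand), then reshape to a wide matrix.
ligand_matrix = (
    toy_signatures
    .drop_duplicates(subset=["population", "source"])
    .pivot(index="population", columns="source", values="expression")
)

print(ligand_matrix)
```

Population and ligand combinations that are absent from the input show up as `NaN` in the pivoted matrix.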

[7]:

# Load query data (these are the migrating cells expressing chemokine receptors)
# here myeloid cells from the cross-tissue immune atlas (Dominguez-Conde et al. Science 2022)
adata = sc.read_h5ad('myeloid.h5ad')

[8]:

# Compute migration scores for each cell in query data against the generated reference
mdata = c2h.compute_cell_scores(adata, signatures)

/home/jovyan/my-conda-envs/cell2home/lib/python3.8/site-packages/anndata/_core/anndata.py:1840: UserWarning: Variable names are not unique. To make them unique, call `.var_names_make_unique`.
  utils.warn_names_duplicates("var")

[9]:

# Collapse scores at the level of population and target=receptor
# (this is done for later analysis aggregating signals at the receptor level)
c2h.collapse_scores(mdata, ["population", "target"], collapse_key="cell2home_target_affinity")

[10]:

# Collapse scores at the level of population (this is the final result)
c2h.collapse_scores(mdata, "population",
                    score_key="cell2home_target_affinity",
                    collapse_key="cell2home_affinity"
                   )

[11]:

mdata.update()
mdata

[11]:

MuData object with n_obs × n_vars = 51552 × 38779
  4 modalities
    rna:    51552 x 36601
      obs:  'Organ', 'Donor', 'Chemistry', 'Predicted_labels_CellTypist', 'Majority_voting_CellTypist', 'Manually_curated_celltype'
      obsm: 'X_umap'
    cell_scores:    51552 x 1692
      obs:  'Organ', 'Donor', 'Chemistry', 'Predicted_labels_CellTypist', 'Majority_voting_CellTypist', 'Manually_curated_celltype'
      var:  'population', 'source', 'target'
      obsm: 'X_umap'
    cell2home_target_affinity:      51552 x 468
      obs:  'Organ', 'Donor', 'Chemistry', 'Predicted_labels_CellTypist', 'Majority_voting_CellTypist', 'Manually_curated_celltype'
      var:  'population', 'target'
    cell2home_affinity:     51552 x 18
      obs:  'Organ', 'Donor', 'Chemistry', 'Predicted_labels_CellTypist', 'Majority_voting_CellTypist', 'Manually_curated_celltype'
      var:  'population'
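The modality widths above are consistent with a grouped aggregation over the signature columns: 1692 = 18 populations × 94 ligand-receptor pairs, 468 = 18 populations × 26 receptors, and finally 18 populations. The sketch below illustrates this kind of two-step column collapse with plain pandas on made-up numbers. Summation is an assumption chosen for illustration; this notebook does not show how `c2h.collapse_scores` combines columns. The population names `NK` and `ILC3` are used only as toy labels.

```python
import numpy as np
import pandas as pd

# Made-up per-cell scores: 3 cells x 4 (population, source, target) columns.
columns = pd.MultiIndex.from_tuples(
    [
        ("NK", "XCL1", "XCR1"),
        ("NK", "XCL2", "XCR1"),
        ("NK", "CXCL12", "CXCR4"),
        ("ILC3", "CXCL12", "CXCR4"),
    ],
    names=["population", "source", "target"],
)
cell_scores = pd.DataFrame(
    np.array([[1.0, 0.5, 0.0, 0.2], [0.0, 0.0, 2.0, 1.0], [0.3, 0.1, 0.4, 0.0]]),
    index=["cell_a", "cell_b", "cell_c"],
    columns=columns,
)

# Step 1: collapse ligands within each (population, receptor) pair.
# Summation is an assumed aggregation, used here only for illustration.
target_affinity = cell_scores.T.groupby(level=["population", "target"]).sum().T

# Step 2: collapse receptors within each population.
affinity = target_affinity.T.groupby(level="population").sum().T

print(target_affinity)
print(affinity)
```

Keeping the intermediate receptor-level table is what makes it possible to later ask which receptor contributes most to a given population-level score.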

[12]:

# List populations for plotting
populations = mdata['cell2home_affinity'].var_names.unique().tolist()

[13]:

# Visualise scores as heatmap
sc.pl.matrixplot(mdata['cell2home_affinity'],
                 var_names=populations,
                 groupby='Manually_curated_celltype'
                )

../_images/notebooks_001_cell2home_demo_14_0.png

[14]:

# Scaling often works better
sc.pl.matrixplot(mdata['cell2home_affinity'],
                 var_names=populations,
                 groupby='Manually_curated_celltype',
                 standard_scale='var'
                )

../_images/notebooks_001_cell2home_demo_15_0.png

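Here colour intensity shows the predicted affinity of each myeloid cell type (y axis) towards each T and NK cell subset (x axis). With standard_scale='var', every column is scaled to the 0-1 range independently, so intensities are comparable within a column but not across columns.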

[15]:

# Plot affinity of the myeloid cells towards NK_CD56bright_CD16-, and specific affinity through XCR1
mu.pl.embedding(mdata,
                basis="rna:X_umap",
                color=["cell2home_affinity:NK_CD56bright_CD16-", "cell2home_target_affinity:NK_CD56bright_CD16-:XCR1", "rna:Manually_curated_celltype"],
                cmap='inferno',
                vmax='p99'
               )

../_images/notebooks_001_cell2home_demo_17_1.png

We find that NK_CD56bright_CD16- cells are predicted to attract alveolar macrophages and DC1, and that the affinity of DC1 for this population is driven largely by XCR1.
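The `vmax='p99'` argument in the plotting call caps the colour scale at the 99th percentile of each plotted variable, so that a handful of extreme cells do not compress the rest of the range. For display purposes this behaves like clipping. A minimal sketch of the idea with NumPy on made-up values (the score distribution and the outliers are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up affinity scores with a few artificial outliers.
scores = rng.gamma(shape=2.0, scale=1.0, size=1000)
scores[:3] = [50.0, 60.0, 70.0]

# The 99th percentile is used as the upper end of the colour scale.
vmax = np.percentile(scores, 99)

# Values above vmax would all be drawn with the top colour.
clipped = np.minimum(scores, vmax)

print(f"max={scores.max():.1f}, p99={vmax:.2f}")
```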

[16]:

# Extract the group-mean matrix for plotting a clustered heatmap

A = mdata["cell2home_affinity"]          # AnnData holding per-population affinities
g = A.obs["Manually_curated_celltype"]   # grouping variable (query cell types)

# Group means (rows = cell types, columns = populations in the chosen order)
M = sc.get.obs_df(A, keys=list(populations)).assign(_g=g.values).groupby("_g").mean()[list(populations)]
if isinstance(g.dtype, pd.CategoricalDtype):   # keep the displayed row order
    M = M.reindex(g.cat.categories)

# Equivalent of standard_scale='var': min-max per column (constant columns become 0)
M_plot = ((M - M.min()) / (M.max() - M.min())).fillna(0.0)

[17]:

# Prioritise strong signals for visualisation
M_plot_stronger = M_plot.pow(2.0)          # try 2–3; higher => harsher
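Because the min-max scaled matrix lies in the range 0 to 1, raising it to a power greater than 1 leaves 0 and 1 unchanged and pushes intermediate values towards 0, while preserving their rank order. A small worked example on a made-up matrix (`toy` and its labels are hypothetical):

```python
import pandas as pd

# Made-up group means: one informative column and one constant column.
toy = pd.DataFrame(
    {"pop_a": [0.0, 2.0, 4.0], "pop_b": [1.0, 1.0, 1.0]},
    index=["celltype_1", "celltype_2", "celltype_3"],
)

# Same min-max scaling as in cell [16]; constant columns give NaN, set to 0.
scaled = ((toy - toy.min()) / (toy.max() - toy.min())).fillna(0.0)

# Squaring damps mid-range values: 0.5 becomes 0.25, while 0 and 1 stay put.
stronger = scaled.pow(2.0)

print(stronger)
```

This is purely a visual emphasis: the transformed values should not be read as affinities on the original scale.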

[18]:

sns.clustermap(M_plot_stronger.T, col_cluster=True, row_cluster=True, figsize = (8,8))

[18]:

<seaborn.matrix.ClusterGrid at 0x7fa699f08a90>

../_images/notebooks_001_cell2home_demo_21_1.png

[19]:

# Find top interactions predicted to drive migration of a cell type to a target tissue/cell

top = c2h.compute_top_interactions_one_group(
    mdata,
    modality="cell_scores",
    identity="NK_CD56bright_CD16-",
    var_key="population",
    split_obs_key="Manually_curated_celltype",
    group_identity="DC1",
    top_n=10,
)
top

[19]:

XCL1–XCR1       1.420306
XCL2–XCR1       1.277904
CXCL12–CXCR4    0.788314
ICAM1–ITGAL     0.479316
CXCL14–CXCR4    0.339726
ICAM3–ITGAL     0.285965
F11R–ITGAL      0.271622
CXCL9–CXCR3     0.181886
ICAM2–ITGAL     0.134578
CCL25–CCR9      0.023951
dtype: float32
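The result printed above looks like a pandas Series indexed by "ligand–receptor" labels. Assuming that is the case, the pairs can be summarised per receptor by splitting each label on the en dash. The sketch below rebuilds the Series from the printed values (`top_example` is a stand-in for `top`) so that it runs on its own.

```python
import pandas as pd

# Values copied from the output above; the label separator is an en dash.
top_example = pd.Series(
    {
        "XCL1–XCR1": 1.420306,
        "XCL2–XCR1": 1.277904,
        "CXCL12–CXCR4": 0.788314,
        "ICAM1–ITGAL": 0.479316,
        "CXCL14–CXCR4": 0.339726,
        "ICAM3–ITGAL": 0.285965,
        "F11R–ITGAL": 0.271622,
        "CXCL9–CXCR3": 0.181886,
        "ICAM2–ITGAL": 0.134578,
        "CCL25–CCR9": 0.023951,
    },
    dtype="float32",
)

# Group by the receptor, i.e. the part of each label after the en dash.
per_receptor = (
    top_example
    .groupby(lambda label: label.split("–")[-1])
    .sum()
    .sort_values(ascending=False)
)

print(per_receptor)
```

In this example XCR1 ranks first, ahead of ITGAL and CXCR4, which matches the two XCR1 pairs leading the list.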

[20]:

c2h.plot_top_interactions_bar_one_group(
    top,
    title="Top ligand–receptor pairs | population=NK_CD56bright_CD16- | DC1",
    cmap="Reds",
    figsize=(3,3)
)

../_images/notebooks_001_cell2home_demo_23_0.png

[21]:

# Save the cell2home MuData object
mdata.write("cell2home_demo.h5mu")

[22]:

session_info.show()

[22]:

Session information

-----
anndata             0.9.2
cell2home           0.0.1
matplotlib          3.7.5
mudata              0.2.3
muon                0.1.6
numpy               1.24.4
pandas              2.0.3
scanpy              1.9.6
scipy               1.10.1
seaborn             0.11.2
session_info        v1.0.1
-----

Modules imported as dependencies

PIL                 10.4.0
asttokens           NA
backcall            0.2.0
backports           NA
cycler              0.12.1
cython_runtime      NA
dateutil            2.9.0
debugpy             1.8.5
decorator           5.1.1
executing           2.1.0
get_annotations     NA
h5py                3.11.0
importlib_metadata  NA
importlib_resources NA
ipykernel           6.14.0
jaraco              NA
jedi                0.19.1
joblib              1.4.2
kiwisolver          1.4.7
llvmlite            0.41.1
matplotlib_inline   0.1.7
more_itertools      10.3.0
mpl_toolkits        NA
natsort             8.4.0
numba               0.58.1
packaging           26.0
parso               0.8.4
patsy               1.0.2
pexpect             4.9.0
pickleshare         0.7.5
platformdirs        4.3.6
prompt_toolkit      3.0.48
psutil              6.0.0
ptyprocess          0.7.0
pure_eval           0.2.3
pydev_ipython       NA
pydevconsole        NA
pydevd              2.9.5
pydevd_file_utils   NA
pydevd_plugins      NA
pydevd_tracing      NA
pygments            2.18.0
pynndescent         0.6.0
pyparsing           3.1.4
pytz                2026.1.post1
setuptools          75.3.0
six                 1.16.0
sklearn             1.3.2
stack_data          0.6.2
statsmodels         0.14.1
threadpoolctl       3.5.0
tornado             6.4.1
tqdm                4.67.3
traitlets           5.14.3
umap                0.5.7
wcwidth             0.2.13
zipp                NA
zmq                 26.2.0

-----
IPython             8.4.0
jupyter_client      8.6.3
jupyter_core        5.8.1
-----
Python 3.8.20 | packaged by conda-forge | (default, Sep 30 2024, 17:52:49) [GCC 13.3.0]
Linux-4.15.0-213-generic-x86_64-with-glibc2.17
-----
Session information updated at 2026-03-09 17:36
