Inputs

ELLA mainly takes two pandas data frames as inputs.

1. Gene expression with nuclear center
A pandas data frame (expr) with a few columns:

  • spatial gene expression including the coordinates (x, y) and the corresponding counts (umi)
  • cell center (centerX, centerY)
  • cell type (type), cell ID (cell), gene ID (gene)
  • the total expression count of each cell (sc_total); rows belonging to the same cell must have the same value in this column.
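
As a minimal sketch of the layout described above, the snippet below builds a toy expr data frame and checks that it has the listed columns and that sc_total is constant within each cell. The column names come from the list above; the values, and the helper names required, missing and inconsistent_cells, are made up for illustration and are not part of ELLA.

```python
import pandas as pd

# Toy data: two cells with a few transcript locations each (values are made up).
expr = pd.DataFrame({
    "cell": ["c1", "c1", "c2"],
    "type": ["A", "A", "B"],
    "gene": ["g1", "g1", "g2"],
    "x": [10.0, 12.5, 40.0],        # transcript coordinates
    "y": [5.0, 7.5, 22.0],
    "umi": [2, 1, 3],               # counts observed at (x, y)
    "centerX": [11.0, 11.0, 41.0],  # cell center, repeated on every row of the cell
    "centerY": [6.0, 6.0, 21.0],
    "sc_total": [3, 3, 3],          # total counts of the cell, repeated per row
})

# Check that every column named in the list above is present.
required = ["x", "y", "umi", "centerX", "centerY", "type", "cell", "gene", "sc_total"]
missing = [c for c in required if c not in expr.columns]

# sc_total should take exactly one value within each cell.
n_totals_per_cell = expr.groupby("cell")["sc_total"].nunique()
inconsistent_cells = n_totals_per_cell[n_totals_per_cell > 1].index.tolist()

print("missing columns:", missing)
print("cells with inconsistent sc_total:", inconsistent_cells)
```

If either list is non-empty, the data frame probably needs fixing before it is passed on.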

Here’s what the data frame looks like:

And here’s what the expression of one gene (dots in red) and the cell center (crosses in red) look like:


2. Cell segmentation
A pandas data frame (cell_seg) with 3 columns:

  • cell ID (cell)
  • the coordinates of points that characterize the cell segmentation boundary (cell_seg).
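
A rough sketch of such a boundary table is shown below. It assumes that the two coordinate columns are named x and y, which is a guess: the description above only states that there are 3 columns in total. The check that each cell has at least three boundary points is a generic polygon sanity check, not an ELLA requirement.

```python
import pandas as pd

# Toy boundaries: each cell is outlined by four points (values are made up).
# ASSUMPTION: the coordinate columns are called "x" and "y".
cell_seg = pd.DataFrame({
    "cell": ["c1"] * 4 + ["c2"] * 4,
    "x": [0, 20, 20, 0, 30, 50, 50, 30],
    "y": [0, 0, 15, 15, 10, 10, 30, 30],
})

# A closed boundary needs at least three points per cell.
points_per_cell = cell_seg.groupby("cell").size()
too_few = points_per_cell[points_per_cell < 3].index.tolist()

print(points_per_cell)
print("cells with fewer than 3 boundary points:", too_few)
```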

Here’s what the data frame looks like:

And here’s what it actually looks like (in red solid line):


Nucleus segmentation
[Optional, for visualization purposes ONLY]

A pandas data frame (nucleus_seg) with 3 columns:

  • cell ID (cell)
  • the coordinates of points that characterize the nucleus segmentation boundary (nucleus_seg).

Here’s what the data frame looks like:

And here’s what it actually looks like (in darkgray dashed line):


Other required inputs

  • types: a list of all cell types.
  • cells: a dictionary that maps each cell type to the list of cells of that type.
  • cells_all: a list of all cells across cell types.
  • genes: a dictionary that maps each cell type to the list of genes for that type.
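
One possible way to derive these four objects from an expr data frame is sketched below. It assumes that every gene observed in a cell type should be listed for that type; whether ELLA expects a filtered subset instead is not stated here, so treat the genes construction as a starting point. The toy values are made up.

```python
import pandas as pd

# Only the columns needed for this step (values are made up).
expr = pd.DataFrame({
    "cell": ["c1", "c1", "c2", "c3"],
    "type": ["A", "A", "B", "A"],
    "gene": ["g1", "g2", "g2", "g1"],
})

# All cell types, in a stable (sorted) order.
types = sorted(expr["type"].unique().tolist())

# Cell type -> list of cells of that type.
cells = {
    t: sorted(expr.loc[expr["type"] == t, "cell"].unique().tolist())
    for t in types
}

# All cells across cell types, following the order of `types`.
cells_all = [c for t in types for c in cells[t]]

# Cell type -> list of genes observed in that type.
# ASSUMPTION: no filtering is applied to the genes.
genes = {
    t: sorted(expr.loc[expr["type"] == t, "gene"].unique().tolist())
    for t in types
}

print(types, cells, cells_all, genes, sep="\n")
```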

Now try converting your own data into the format that ELLA takes and give it a go!