Inputs
ELLA mainly takes two pandas data frames as inputs.
1. Gene expression with nuclear center
A pandas data frame (expr) with the following columns:
- spatial gene expression: the coordinates (x, y) and the corresponding counts (umi)
- cell center (centerX, centerY)
- cell type (type), cell ID (cell), gene ID (gene)
- total expression count of the cell (sc_total); rows corresponding to the same cell should have the same value in this column.
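As a rough illustration of the layout described above, the following sketch builds a tiny expr data frame and checks it. The values, the cell, gene, and type labels, and the helper variables (required, missing, inconsistent_cells) are all made up for this example and are not part of ELLA.

```python
import pandas as pd

# Toy data, invented for illustration: two cells, two genes.
expr = pd.DataFrame({
    "x":        [10.2, 11.5, 30.1, 31.0],   # transcript coordinates
    "y":        [5.0, 6.3, 20.4, 19.8],
    "umi":      [1, 2, 1, 3],               # counts at each location
    "centerX":  [10.0, 10.0, 30.0, 30.0],   # cell center
    "centerY":  [5.5, 5.5, 20.0, 20.0],
    "type":     ["typeA", "typeA", "typeB", "typeB"],
    "cell":     ["cell1", "cell1", "cell2", "cell2"],
    "gene":     ["geneA", "geneB", "geneA", "geneB"],
    "sc_total": [3, 3, 4, 4],               # same value on every row of a cell
})

# Check that all documented columns are present.
required = ["x", "y", "umi", "centerX", "centerY",
            "type", "cell", "gene", "sc_total"]
missing = [c for c in required if c not in expr.columns]

# Check that sc_total takes a single value within each cell.
n_totals = expr.groupby("cell")["sc_total"].nunique()
inconsistent_cells = n_totals[n_totals > 1].index.tolist()

print("missing columns:", missing)
print("cells with inconsistent sc_total:", inconsistent_cells)
```

Both lists should be empty for a correctly formatted data frame; a non-empty list points to the columns or cells that need fixing.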
Here’s what the data frame looks like:
And here’s what the expression of one gene (red dots) and the cell center (red crosses) look like:
2. Cell segmentation
A pandas data frame (cell_seg) with 3 columns:
- cell ID (cell)
- the coordinates of points that characterize the cell segmentation boundary (cell_seg).
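A segmentation table of this kind can be sketched in long format, with one row per boundary point. In the sketch below, the coordinate column names x and y are assumptions chosen for illustration, as are the toy values and the helper variables (expr_cells, too_few, cells_without_boundary); none of them are confirmed ELLA names.

```python
import pandas as pd

# Hypothetical long-format boundary table: one row per boundary point.
# The coordinate column names "x" and "y" are assumptions.
cell_seg = pd.DataFrame({
    "cell": ["cell1"] * 4 + ["cell2"] * 4,
    "x": [8.0, 12.0, 12.0, 8.0, 28.0, 32.0, 32.0, 28.0],
    "y": [3.0, 3.0, 8.0, 8.0, 18.0, 18.0, 22.0, 22.0],
})

# Cell IDs that would appear in the expression data frame (toy values).
expr_cells = {"cell1", "cell2"}

# A closed boundary needs at least three points.
points_per_cell = cell_seg.groupby("cell").size()
too_few = points_per_cell[points_per_cell < 3].index.tolist()

# Every cell with expression data should also have a boundary.
cells_without_boundary = sorted(expr_cells - set(cell_seg["cell"]))

print("cells with too few boundary points:", too_few)
print("cells without a boundary:", cells_without_boundary)
```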
Here’s what the data frame looks like:
And here’s what the segmentation actually looks like (shown as a solid red line):
Nucleus segmentation
[Optional, for visualization purposes ONLY]
A pandas data frame (nucleus_seg) with 3 columns:
- cell ID (cell)
- the coordinates of points that characterize the nucleus segmentation boundary (nucleus_seg).
Here’s what the data frame looks like:
And here’s what the segmentation actually looks like (shown as a dark gray dashed line):
Other required inputs
- types: a list of all cell types.
- cells: a dictionary of lists, giving the cells in each cell type.
- cells_all: a list of all cells across cell types.
- genes: a dictionary of lists, giving the genes in each cell type.
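One plausible way to build these four objects is to derive them from the type, cell, and gene columns of expr. This is a minimal sketch under that assumption, using toy labels; in particular, genes here contains every gene observed in a cell type, whereas in practice you may want to restrict it to a chosen subset.

```python
import pandas as pd

# Toy stand-in for expr; only the columns needed here are included.
expr = pd.DataFrame({
    "type": ["typeA", "typeA", "typeA", "typeB", "typeB"],
    "cell": ["cell1", "cell1", "cell2", "cell3", "cell3"],
    "gene": ["geneA", "geneB", "geneA", "geneA", "geneC"],
})

# List of all cell types.
types = sorted(expr["type"].unique().tolist())

# Dictionary: cell type -> list of cells of that type.
cells = {t: sorted(g["cell"].unique().tolist())
         for t, g in expr.groupby("type")}

# Flat list of all cells across cell types.
cells_all = [c for t in types for c in cells[t]]

# Dictionary: cell type -> list of genes observed in that type.
genes = {t: sorted(g["gene"].unique().tolist())
         for t, g in expr.groupby("type")}

print(types)
print(cells)
print(cells_all)
print(genes)
```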
Try reshaping your own data into the format that ELLA takes and give it a go!