Note

This page was generated from notebooks\Demo-micropools.ipynb.

Compass Micropooled Analysis

To install the required Python packages, uncomment the “install_reqs()” call at the end of the cell below.

[1]:

def install_reqs():
    !pip install pandas
    !pip install numpy
#install_reqs()

[2]:

import sys
if "google.colab" in sys.modules:
  !git clone -b docs https://github.com/YosefLab/Compass.git --depth 1
  !cp -r Compass/notebooks/extdata ./
  !rm -r /content/Compass
  install_reqs()

[3]:

import pandas as pd
import numpy as np

This notebook demonstrates how to analyze Compass results that were computed on micropooled data. In particular, it shows how to determine which cell type each cluster/pool represents.

[4]:

cell_md = pd.read_csv("extdata/Th17/cell_metadata.csv", index_col=0)
reaction_penalties = pd.read_csv("extdata/Th17-micropooled/reactions.tsv", sep="\t", index_col=0)
micropools = pd.read_csv("extdata/Th17-micropooled/micropools.tsv", sep="\t", index_col=0)

[5]:

# Map each microcluster ID to the list of cells assigned to it.
clusters = {}
for cell in micropools.index:
    mc = micropools.loc[cell, 'microcluster']
    clusters.setdefault(mc, []).append(cell)
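
The grouping step above can also be expressed with pandas groupby. The following is a minimal sketch on a small, made-up stand-in for micropools.tsv. The cell IDs and the names micropools_toy and clusters_toy are illustrative and not part of the Compass data. It assumes the same layout as the cell above: cell IDs as the index and a 'microcluster' column.

```python
import pandas as pd

# Toy stand-in for micropools.tsv (hypothetical cell IDs and assignments).
micropools_toy = pd.DataFrame(
    {"microcluster": [0, 1, 0, 2, 1]},
    index=["cell_a", "cell_b", "cell_c", "cell_d", "cell_e"],
)

# .groups maps each microcluster ID to the index labels (cell IDs) in that group.
clusters_toy = {
    mc: list(cells)
    for mc, cells in micropools_toy.groupby("microcluster").groups.items()
}

for mc, cells in clusters_toy.items():
    print(mc, cells)
```

The result has the same shape as the clusters dictionary built by the loop: one list of cell IDs per microcluster.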

[6]:

# Count the Th17p and Th17n cells in each cluster.
Th17p, Th17n = {cl:0 for cl in clusters}, {cl:0 for cl in clusters}
for cl in clusters:
    for cell in clusters[cl]:
        cell_type = cell_md.loc[cell, 'cell_type']
        if cell_type == 'Th17p':
            Th17p[cl] += 1
        elif cell_type == 'Th17n':
            Th17n[cl] += 1
        else:
            # Only Th17p and Th17n labels are expected in this dataset.
            print("Should not happen")
# Fraction of Th17p cells in each cluster.
pctTh17p = {cl:Th17p[cl] / (Th17p[cl] + Th17n[cl]) for cl in clusters}
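
As an alternative to the counting loop, the per-cluster Th17p fraction can be sketched with a join and a grouped mean. The data below are made up for illustration, and the names cell_md_toy, micropools_toy and pct_toy are hypothetical. The sketch assumes every cell is labeled either Th17p or Th17n, so that the mean of the boolean column equals Th17p / (Th17p + Th17n).

```python
import pandas as pd

# Toy stand-ins for cell_metadata.csv and micropools.tsv (hypothetical values).
cell_ids = ["c1", "c2", "c3", "c4", "c5"]
cell_md_toy = pd.DataFrame(
    {"cell_type": ["Th17p", "Th17p", "Th17n", "Th17n", "Th17p"]},
    index=cell_ids,
)
micropools_toy = pd.DataFrame({"microcluster": [0, 0, 1, 1, 1]}, index=cell_ids)

# Attach each cell's type to its microcluster assignment, aligned on cell ID.
joined = micropools_toy.join(cell_md_toy["cell_type"])

# The mean of a boolean column within each group is the fraction of True values.
is_th17p = joined["cell_type"] == "Th17p"
pct_toy = is_th17p.groupby(joined["microcluster"]).mean().to_dict()

print(pct_toy)
```

In this toy example pool 0 is pure Th17p, while pool 1 is mixed, which is the situation a cutoff is meant to handle.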

The dictionary pctTh17p gives, for each cluster, the fraction of its cells that are Th17p; the remainder are Th17n. In this case, every cluster consists entirely of one cell type or the other. You can then apply the regular analysis to the micropooled data, treating each cluster as its predominant cell type.

[7]:

pctTh17p

[7]:

{10: 1.0,
 24: 1.0,
 12: 1.0,
 11: 1.0,
 14: 1.0,
 9: 1.0,
 27: 1.0,
 23: 1.0,
 28: 1.0,
 15: 1.0,
 26: 1.0,
 25: 1.0,
 16: 1.0,
 13: 1.0,
 4: 0.0,
 3: 0.0,
 7: 0.0,
 20: 0.0,
 18: 0.0,
 2: 0.0,
 21: 0.0,
 17: 0.0,
 6: 0.0,
 8: 0.0,
 0: 0.0,
 5: 0.0,
 1: 0.0,
 19: 0.0,
 22: 0.0}

For this dataset the micropooling worked very well: every pool/cluster is composed of a single cell type. For other datasets you may want to set a cutoff, such as the 90 percent used in the code below.

[8]:

def mc_type(pct):
    if pct > 0.9:
        return 'Th17p'
    elif pct < 0.1:
        return 'Th17n'
    else:
        return 'Uncertain'
micropool_md = {'cluster_'+str(cl):mc_type(pctTh17p[cl]) for cl in pctTh17p}
micropool_md = pd.DataFrame.from_dict(micropool_md, orient='index', columns=['cell_type'])
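
To see how the cutoff behaves when pools are not pure, here is a small sketch on made-up fractions. The cutoff argument and the names pct_toy, md_toy and confident are hypothetical additions for illustration. The notebook itself hard-codes 0.9 and 0.1.

```python
import pandas as pd

def mc_type(pct, cutoff=0.9):
    """Label a pool from its Th17p fraction (cutoff is an illustrative parameter)."""
    if pct > cutoff:
        return "Th17p"
    elif pct < 1 - cutoff:
        return "Th17n"
    return "Uncertain"

# Made-up Th17p fractions for three pools: mostly Th17p, mixed, mostly Th17n.
pct_toy = {0: 0.95, 1: 0.50, 2: 0.02}

labels = {"cluster_" + str(cl): mc_type(p) for cl, p in pct_toy.items()}
md_toy = pd.Series(labels, name="cell_type").to_frame()

# One option is to drop ambiguous pools before downstream comparisons.
confident = md_toy[md_toy["cell_type"] != "Uncertain"]

print(md_toy)
print(confident)
```

Whether to drop "Uncertain" pools or keep them as a separate group is a design choice that depends on how many pools fall between the two thresholds.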

[9]:

micropool_md.to_csv("extdata/Th17-micropooled/cluster_metadata.csv")

This gives a metadata table to use when determining what each cluster represents, analogous to the regular cell metadata. The data can then be analyzed as you would a regular dataset.

For the Python notebook that demonstrates an analysis of Compass results, this can be done simply by changing the input files. Replace “extdata/Th17/reactions.tsv” with “extdata/Th17-micropooled/reactions.tsv” and “extdata/Th17/cell_metadata.csv” with “extdata/Th17-micropooled/cluster_metadata.csv”.
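
Before swapping the input files, it can help to check that the cluster metadata lines up with the micropooled reaction table. The sketch below uses small in-memory tables instead of the real files so that it runs on its own. The reaction IDs (RXN_A, RXN_B) and all values are made up. It assumes, as the 'cluster_' naming above suggests, that the micropooled reactions.tsv has one column per pool named like 'cluster_0'.

```python
import io
import pandas as pd

# In-memory stand-in for extdata/Th17-micropooled/reactions.tsv (made-up values).
reactions_tsv = "\tcluster_1\tcluster_0\nRXN_A\t1.5\t2.0\nRXN_B\t0.3\t0.7\n"
reaction_penalties_toy = pd.read_csv(io.StringIO(reactions_tsv), sep="\t", index_col=0)

# In-memory stand-in for extdata/Th17-micropooled/cluster_metadata.csv.
metadata_csv = ",cell_type\ncluster_0,Th17n\ncluster_1,Th17p\n"
cluster_md_toy = pd.read_csv(io.StringIO(metadata_csv), index_col=0)

# Every pool in the reaction table should have a metadata row.
missing = set(reaction_penalties_toy.columns) - set(cluster_md_toy.index)
if missing:
    raise ValueError(f"Pools without metadata: {sorted(missing)}")

# Reorder the metadata rows to match the column order of the reaction table.
cluster_md_toy = cluster_md_toy.loc[reaction_penalties_toy.columns]

print(cluster_md_toy)
```

With the real files, the two read_csv calls would take the micropooled paths named above in place of the StringIO objects.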

[10]:

micropool_md

[10]:

	cell_type
cluster_10	Th17p
cluster_24	Th17p
cluster_12	Th17p
cluster_11	Th17p
cluster_14	Th17p
cluster_9	Th17p
cluster_27	Th17p
cluster_23	Th17p
cluster_28	Th17p
cluster_15	Th17p
cluster_26	Th17p
cluster_25	Th17p
cluster_16	Th17p
cluster_13	Th17p
cluster_4	Th17n
cluster_3	Th17n
cluster_7	Th17n
cluster_20	Th17n
cluster_18	Th17n
cluster_2	Th17n
cluster_21	Th17n
cluster_17	Th17n
cluster_6	Th17n
cluster_8	Th17n
cluster_0	Th17n
cluster_5	Th17n
cluster_1	Th17n
cluster_19	Th17n
cluster_22	Th17n
