ssGSEA: Accept a Series with gene names as index and return a dataframe #27

olgabot · 2017-08-02T00:08:30Z

Hello,
I'm very excited about a Python implementation of ssGSEA! I'd like to convert my gene expression values to pathway expression values by using apply to perform ssGSEA on every row of a pandas.DataFrame expression matrix. Right now, it looks like this:

expression.head().apply(
    lambda x: gp.ssgsea(x.reset_index(), gene_sets=gmt), 
    axis=1)

Calling reset_index() on every row seems unnecessary.
Each run of ssgsea returns None, rather than returning the converted pathway enrichment. I'd rather not have to read a file for every single sample I have (~6,000 of them), so can ssgsea return the Series instead?

Warmest,
Olga


olgabot · 2017-08-02T00:24:28Z

It also seems unnecessary to perform the gene set filtering every single time for every sample, but rather do it once for all samples.
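To illustrate the "filter once" idea, here is a minimal sketch in plain Python. It is not gseapy's implementation: the function name `filter_gene_sets`, the `min_size`/`max_size` parameters, and the toy gene sets are all hypothetical, and it assumes every sample shares the same measured genes.

```python
def filter_gene_sets(gene_sets, measured_genes, min_size=2, max_size=1000):
    """Keep only the measured genes in each set and drop sets outside the size limits."""
    measured = set(measured_genes)
    filtered = {}
    for name, genes in gene_sets.items():
        # Preserve the original gene order while restricting to measured genes.
        kept = [g for g in genes if g in measured]
        if min_size <= len(kept) <= max_size:
            filtered[name] = kept
    return filtered


# Hypothetical toy data: genes measured in every sample, and two gene sets.
measured_genes = ["TP53", "BRCA1", "EGFR", "MYC"]
gene_sets = {
    "PATHWAY_A": ["TP53", "BRCA1", "KRAS"],
    "PATHWAY_B": ["KRAS", "NRAS", "MYC"],
}

# Run the filter once, then reuse `filtered` for every sample.
filtered = filter_gene_sets(gene_sets, measured_genes, min_size=2)
print(filtered)
```

Because the result depends only on the gene list and not on the expression values, it can be computed before looping over samples.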

zqfang · 2017-08-02T08:26:00Z

Hi @olgabot,

reset_index() is necessary only if your input is a pd.Series. I can fix this to support a pd.Series with gene names as the index (install the latest PR).
Actually, ssgsea() returns an ssGSEA object, which has many attributes. Sorry for the unclear docs. Please try:

    # For example:
    ss = gp.ssgsea(x, gene_sets=gmt)
    # The res2d attribute is a DataFrame containing all final enrichment results (see figure below).
    ss.res2d
    # The results attribute is an OrderedDict containing all internal statistical testing values.
    ss.results

You are right, the filtering could be done once. Even so, the most time-consuming part is calculating the statistics on the null distributions. The ssGSEA module is currently designed for a single run only.
I will try to add a patch to improve the filter rules soon.
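To give a rough sense of why the null distributions dominate the run time, here is a small sketch of a permutation null in plain Python. It is only an illustration under assumptions: it shuffles gene labels and uses a stand-in statistic (`mean_difference`), not the ssGSEA enrichment score, and gseapy's actual procedure may differ. All names and values are hypothetical.

```python
import random


def mean_difference(values, in_set):
    """Stand-in statistic: mean of genes in the set minus mean of the rest."""
    inside = [v for v, flag in zip(values, in_set) if flag]
    outside = [v for v, flag in zip(values, in_set) if not flag]
    return sum(inside) / len(inside) - sum(outside) / len(outside)


def permutation_null(values, in_set, n_perm=1000, seed=0):
    """Recompute the statistic n_perm times with shuffled gene labels."""
    rng = random.Random(seed)
    labels = list(in_set)
    null = []
    for _ in range(n_perm):
        rng.shuffle(labels)  # the number of True labels stays the same
        null.append(mean_difference(values, labels))
    return null


# Hypothetical expression values for one sample and membership in one gene set.
values = [5.0, 4.0, 1.0, 0.5, 0.2, 0.1]
in_set = [True, True, False, False, False, False]

observed = mean_difference(values, in_set)
null = permutation_null(values, in_set, n_perm=1000)

# Empirical p-value with the usual +1 correction.
p_value = (1 + sum(s >= observed for s in null)) / (1 + len(null))
```

In this sketch each gene set in each sample costs `n_perm` extra evaluations of the statistic, so the permutation step outweighs a one-time filtering pass as the number of samples grows.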

Thank you very much for your great advice.


zqfang · 2017-08-25T06:30:43Z

gseapy 0.8.4 now supports the GCT format, Series, and DataFrames with only one column (with gene symbols as the index):

gss = gp.ssgsea(expression, gene_sets="KEGG_2016")
# All results are available as a dict:
gss.resultsOnSamples

zqfang added the enhancement label Aug 11, 2017

zqfang pushed a commit that referenced this issue Aug 11, 2017

refer to #27,gct expression matrix support for ssgsea, no need to cal…

c76621d

…l apply now

zqfang closed this as completed Oct 8, 2017
