Some problem of reproducing Fig.5 results #67

Open
JaneJiayiDong opened this issue May 11, 2022 · 1 comment

Comments

@JaneJiayiDong

Hello, sorry to bother you. I am having trouble reproducing the results in Fig. 5 of the paper. I downloaded the data (BEELINE-data and Networks) from Zenodo and ran generateExpInputs.py.

  1. I used the mESC expression data and the network Non-Specific-ChIP-seq-network.csv, leaving all other parameters at their defaults. The script fails with the following error:
Traceback (most recent call last):
  File "generateExpInputs_raw.py", line 171, in <module>
    print("\n#TFs: %d, #Genes: %d, #Edges: %d, Density: %.3f" % (nTFs,nGenes,netDF.shape[0],netDF.shape[0]/((nTFs*nGenes)-nTFs)))
ZeroDivisionError: division by zero

I found that the gene names in Non-Specific-ChIP-seq-network.csv are uppercase, unlike those in ExpressionData.csv, so I added
expr_df.index = expr_df.index.to_series().apply(lambda x:x.upper())
before
expr_df.to_csv(opts.outPrefix+'-ExpressionData.csv')
The result is:
#TFs: 27, #Genes: 144, #Edges: 264, Density: 0.068

  2. After reading issue "Fail to reproduce Fig.5 results for human data" (#65), I tried to reproduce the results for the hESC dataset using the STRING ground-truth network. The result is:
    #TFs: 28, #Genes: 82, #Edges: 112, Density: 0.049
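
For reference, the density in both outputs follows the expression in the traceback's print statement, edges / (nTFs * nGenes - nTFs). The sketch below recomputes it from the reported counts and shows why the script divides by zero when no gene names match. The function name `network_density` is hypothetical and not part of generateExpInputs.py.

```python
def network_density(n_tfs: int, n_genes: int, n_edges: int) -> float:
    """Density as in the traceback's print statement: edges / (nTFs*nGenes - nTFs)."""
    possible_edges = n_tfs * n_genes - n_tfs
    if possible_edges == 0:
        # Happens, for example, when no gene in the expression data matches
        # the network (nTFs == 0), which is what a case mismatch produces.
        raise ValueError("no candidate TF-gene pairs; check gene name matching")
    return n_edges / possible_edges


print("%.3f" % network_density(27, 144, 264))  # mESC counts reported above
print("%.3f" % network_density(28, 82, 112))   # hESC counts reported above
```

The two calls give 0.068 and 0.049, matching the printed densities, so the small networks come from the gene filtering step rather than from the density calculation itself.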

I need some help with these problems. Perhaps I have missed some data preprocessing steps; any advice would be appreciated.

Thank you
Best wishes
Jiayi Dong

@JaneJiayiDong
Author

After checking again, I found that it was just a simple mistake. With the following modifications, we get the same results as in Fig. 5.

print("\nReading %s" % (expr_file))
expr_df = pd.read_csv(expr_file, header=0, index_col=0)
expr_df.index = expr_df.index.to_series().apply(lambda x:x.upper())
print("\nReading %s" % (gene_ordering_file))
gene_df = pd.read_csv(gene_ordering_file, header=0, index_col=0)
gene_df.index = gene_df.index.to_series().apply(lambda x:x.upper())
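
To illustrate why the upper-casing matters, here is a minimal sketch on toy data. The gene names and values are made up, the `Gene1`/`Gene2` column names are an assumption about the network file layout, and `shared_genes` is a hypothetical helper, not a function from generateExpInputs.py. It uses `Index.str.upper()`, which has the same effect on a string index as the `to_series().apply(...)` form above.

```python
import pandas as pd

# Toy stand-ins for ExpressionData.csv (mouse-style names) and the network file.
expr_df = pd.DataFrame({"cell1": [1.0, 2.0, 0.5]},
                       index=["Pou5f1", "Sox2", "Nanog"])
net_df = pd.DataFrame({"Gene1": ["POU5F1", "SOX2"],
                       "Gene2": ["NANOG", "NANOG"]})


def shared_genes(expr: pd.DataFrame, net: pd.DataFrame) -> list:
    """Genes present both in the network and in the expression index."""
    net_genes = set(net["Gene1"]) | set(net["Gene2"])
    return sorted(net_genes & set(expr.index))


before = shared_genes(expr_df, net_df)     # case mismatch: nothing overlaps
expr_df.index = expr_df.index.str.upper()  # normalize case, as in the fix above
after = shared_genes(expr_df, net_df)

print(before)
print(after)
```

Before normalization the overlap is empty, which is the situation that leads to the zero denominator; afterwards all three toy genes match.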
