Ask a question #6

Open
moyu3003 opened this issue Nov 2, 2022 · 0 comments

moyu3003 commented Nov 2, 2022

Hello Authors.

Regarding your publications "Inductive transfer learning for molecular activity prediction: Next-Gen QSAR Models with MolPMoFiT" and "SMILES Pair Encoding: A Data-Driven Substructure Tokenization Algorithm for Deep Learning", I ran into the following problems while reproducing your work and would like to ask for your advice.

  1. In the first paper, what is the basis for the data augmentation code in the utils.py you uploaded, and how do you confirm that the augmented molecules have the same properties as the original molecules? Also, parts of this code contain errors, and I have not been able to work out why; could you tell me the cause?

  2. In the second paper, you used SPE to split molecules into substructures, which performs better than the ECFP encoding, but are the resulting substructures accurate from an interpretation standpoint? Also, once the molecules have been split this way, in what form is the data fed into the network model?

  3. Could you share the complete code?
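
To make the second question concrete, here is a minimal sketch of the input format I am guessing at: each substructure token is mapped to an integer id and every sequence is padded to a fixed length. This is only my assumption, not something taken from your papers or code. The token lists are written by hand for illustration and are not real SPE output, and `build_vocab`, `encode`, `<pad>`, and `<unk>` are hypothetical names.

```python
from typing import Dict, List

PAD, UNK = "<pad>", "<unk>"  # hypothetical special tokens


def build_vocab(tokenized: List[List[str]]) -> Dict[str, int]:
    """Assign an integer id to each distinct token, in order of first appearance."""
    vocab = {PAD: 0, UNK: 1}
    for tokens in tokenized:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(tokens: List[str], vocab: Dict[str, int], max_len: int) -> List[int]:
    """Map tokens to ids, truncate to max_len, then right-pad with the PAD id."""
    ids = [vocab.get(tok, vocab[UNK]) for tok in tokens][:max_len]
    return ids + [vocab[PAD]] * (max_len - len(ids))


# Hand-written, hypothetical substructure tokens (NOT produced by SPE).
tokenized = [
    ["CC", "(=O)", "O"],
    ["c1ccccc1", "O"],
]

vocab = build_vocab(tokenized)
batch = [encode(tokens, vocab, max_len=4) for tokens in tokenized]
print(batch)
```

Is an integer matrix like `batch`, passed to an embedding layer, what the model actually receives, or is a different representation used?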

I hope to get your reply, thank you very much!

Translated with www.DeepL.com/Translator (free version)

Lu

2022.11.02

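Dear authors,

Regarding your papers "SMILES Pair Encoding: A Data-Driven Substructure Tokenization Algorithm for Deep Learning" and

"Inductive transfer learning for molecular activity prediction: Next-Gen QSAR Models with MolPMoFiT",

I ran into the following problems while reproducing your work and would like to ask for your advice:

1. In the second paper, what is the basis for the data augmentation code in the utils.py you uploaded, and how do you confirm that the augmented molecules have the same properties as the original molecules? Also, for reasons I cannot identify, this code contains some errors.

2. In the first paper, you used SPE to split molecules into substructures, which performs better than the ECFP encoding, but are the substructures accurate from an interpretation standpoint? Also, once the molecules have been split this way, in what form is the data fed into the network model?

3. Could you share a complete copy of the code?

I hope to hear from you. Thank you very much!

Lu (student)

2022.11.02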
