Import of standard IDT monomers #1588
Comments
olganaz
changed the title
Import modified sequences in IDT format
Import modified sequences from IDT format
Jan 19, 2024
This was referenced Jan 26, 2024
olganaz
changed the title
Import modified sequences from IDT format
Import standard IDT monomers
Apr 3, 2024
7 tasks
4 tasks
AliaksandrDziarkach
added a commit
that referenced
this issue
Apr 9, 2024
Co-authored-by: even1024 <roman.porozhnetov@gmail.com> Co-authored-by: Aliakasndr Dziarkach <Aliakasndr.Dziarkach@gmail.com>
Will be tested in the context of epam/ketcher#4495
This was referenced May 15, 2024
Closed
This was referenced May 23, 2024
This was referenced May 29, 2024
Background
In IDT notation, modified sequences are represented as plain strings that combine standard and modified monomers.
A standard monomer, [s]<Base>[*], is a nucleotide with the same configuration as those supported in Ketcher.
A modified monomer, /<pos><Identifier>/[*], can be a nucleotide, a CHEM, or a combination of the two.
This task covers only the import of standard IDT monomers.
Requirements
The system should interpret standard monomers [s]<Base>[*] as nucleotides whose structure is defined by the name components:
Base - standard unmodified nucleotide base symbol (see table below).
s - optional symbol of the sugar that makes up the nucleotide (see table below). If not specified, the standard sugar deoxyribose (dR) is implied.
* - optional indicator of a modified phosphate. If specified, it indicates that phosphorothioate (sP) is included in the nucleotide; otherwise the standard phosphate (P) is implied.
The last monomer in the chain is considered to be a nucleoside (a nucleotide that lacks a phosphate), so * cannot be the last symbol in the sequence.
Solution
A clear and concise description of what you want to happen.
Alternatives
For the base symbols R, Y, M, K, S, W, H, B, V, D, N and for modified IDT monomers, an error message should be displayed.
Modified IDT monomers /<pos><Identifier>/[*] will be supported as described in #1899.
Additional context
Examples:
ACG
rArC*rG
A*C*G
+C*+G*A
mA*mGC
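The rules above can be illustrated with a small Python sketch. It is not the Indigo implementation, only a minimal reading of the requirements. The sugar prefixes (r, +, m) and the monomer names they map to (R, LR, mR), as well as the accepted base set (A, C, G, T, U), are assumptions inferred from the examples, because the tables mentioned in the requirements are not reproduced here. The names parse_idt_standard and Nucleotide are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional

# ASSUMPTION: sugar prefixes and target monomer names are inferred from the
# examples; only dR, P and sP are named explicitly in the issue text.
SUGARS = {"": "dR", "r": "R", "+": "LR", "m": "mR"}
# ASSUMPTION: the set of accepted standard bases.
STANDARD_BASES = set("ACGTU")
# Base symbols for which the issue requires an error message.
AMBIGUOUS_BASES = set("RYMKSWHBVDN")

# One standard monomer: optional sugar prefix, base letter, optional '*'.
TOKEN = re.compile(r"([r+m]?)([A-Z])(\*?)")


class Nucleotide(NamedTuple):
    sugar: str
    base: str
    phosphate: Optional[str]  # None for the terminal nucleoside


def parse_idt_standard(seq: str) -> List[Nucleotide]:
    """Parse a string of standard IDT monomers of the form [s]<Base>[*]."""
    if "/" in seq:
        raise ValueError("modified IDT monomers are not supported")
    monomers: List[Nucleotide] = []
    pos = 0
    while pos < len(seq):
        match = TOKEN.match(seq, pos)
        if match is None:
            raise ValueError(f"unexpected symbol {seq[pos]!r} at position {pos}")
        sugar, base, star = match.groups()
        if base in AMBIGUOUS_BASES:
            raise ValueError(f"ambiguous base symbol {base!r} is not supported")
        if base not in STANDARD_BASES:
            raise ValueError(f"unknown base symbol {base!r}")
        pos = match.end()
        is_last = pos == len(seq)
        if is_last and star:
            raise ValueError("'*' cannot be the last symbol in the sequence")
        # The last monomer is a nucleoside, so it carries no phosphate.
        phosphate = None if is_last else ("sP" if star else "P")
        monomers.append(Nucleotide(SUGARS[sugar], base, phosphate))
    return monomers


if __name__ == "__main__":
    for example in ["ACG", "rArC*rG", "A*C*G", "+C*+G*A", "mA*mGC"]:
        print(example, parse_idt_standard(example))
```

Because the sugar prefix m is lowercase and the rejected base symbol M is uppercase, the sketch relies on case-sensitive matching to tell them apart.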