V1.2.0 de novo sequence representation #141

Open
rzhang90 opened this issue Jan 15, 2024 · 10 comments
Comments

@rzhang90

To represent a de novo peptide, we can give a SpectrumIdentificationItem zero PeptideEvidence references.

As I understand it, SpectrumIdentificationItem is embedded in SpectrumIdentificationList, which SpectrumIdentification references through an attribute. But SpectrumIdentification requires the subelement SearchDatabaseRef, which a de novo method doesn't have. How should SearchDatabaseRef be set for a de novo SpectrumIdentification?

Could a sample file for the de novo search type also be provided?

Could anyone familiar with de novo output help? Thanks a lot!

@edeutsch
Contributor

I think your answers are in the de novo section here: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5500760/

@rzhang90
Author

> I think your answers are in the de novo section here: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5500760/

Hi Eric! Thanks a lot for your quick help. I've read the de novo section in this paper! It mentions two changes or requirements regarding de novo.

  1. 0 PeptideEvidence for SpectrumIdentificationItem
  2. SpectrumIdentificationProtocol with "de novo search" as SearchType.
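For readers following along, the two requirements can be sketched roughly as below. This is only an illustration built with Python's standard `xml.etree.ElementTree`, not an official example file: the `Fragment` wrapper, all ids, and the attribute values are made up, the child element is assumed to be `PeptideEvidenceRef` as in the 1.2 schema, and the accession `MS:1001010` for "de novo search" is an assumption that should be checked against the PSI-MS CV.

```python
import xml.etree.ElementTree as ET


def build_de_novo_fragment():
    """Build a small, incomplete mzIdentML-like fragment for a de novo result."""
    # Hypothetical wrapper element, used only to hold the two pieces together.
    root = ET.Element("Fragment")

    # Requirement 2: a protocol whose SearchType is "de novo search".
    # The accession is an assumption; verify it against the PSI-MS CV.
    protocol = ET.SubElement(
        root,
        "SpectrumIdentificationProtocol",
        id="SIP_denovo",
        analysisSoftware_ref="AS_denovo",
    )
    search_type = ET.SubElement(protocol, "SearchType")
    ET.SubElement(
        search_type,
        "cvParam",
        cvRef="PSI-MS",
        accession="MS:1001010",
        name="de novo search",
    )

    # Requirement 1: an item that points at a peptide but has no
    # PeptideEvidenceRef children, because no protein database was searched.
    ET.SubElement(
        root,
        "SpectrumIdentificationItem",
        id="SII_1_1",
        peptide_ref="PEP_1",
        chargeState="2",
        experimentalMassToCharge="500.25",
        rank="1",
        passThreshold="true",
    )
    return root


fragment = build_de_novo_fragment()
print(ET.tostring(fragment, encoding="unicode"))
```

The fragment leaves out namespaces and every other required part of a real file; it only shows where the two de novo specifics would sit.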

The SpectrumIdentificationProtocol is then referenced from SpectrumIdentification as:

image

My concern is about SpectrumIdentification. It requires one or more SearchDatabaseRef subelements. But a de novo search doesn't have a database. How do we set it?
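To make the concern concrete, here is a minimal sketch of a check that flags any SpectrumIdentification lacking a SearchDatabaseRef child. It is not a schema validator: the function name `missing_database_refs`, the sample ids, and the namespace-free XML are all assumptions for illustration (real mzIdentML files declare a default namespace, so the tag lookups would need adjusting).

```python
import xml.etree.ElementTree as ET


def missing_database_refs(xml_text):
    """Return ids of SpectrumIdentification elements with no SearchDatabaseRef child."""
    root = ET.fromstring(xml_text)
    return [
        si.get("id")
        for si in root.iter("SpectrumIdentification")
        if si.find("SearchDatabaseRef") is None
    ]


# Made-up sample: one de novo analysis without a database reference,
# and one ordinary database search with a reference.
SAMPLE = """<AnalysisCollection>
  <SpectrumIdentification id="SI_denovo"
      spectrumIdentificationProtocol_ref="SIP_denovo"
      spectrumIdentificationList_ref="SIL_1">
    <InputSpectra spectraData_ref="SD_1"/>
  </SpectrumIdentification>
  <SpectrumIdentification id="SI_db"
      spectrumIdentificationProtocol_ref="SIP_db"
      spectrumIdentificationList_ref="SIL_2">
    <InputSpectra spectraData_ref="SD_1"/>
    <SearchDatabaseRef searchDatabase_ref="DB_1"/>
  </SpectrumIdentification>
</AnalysisCollection>"""

print(missing_database_refs(SAMPLE))
```

With the sample above, only the de novo entry is reported, which is exactly the case the schema currently rejects.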

Thanks again for your help!

@edeutsch
Contributor

Ah, I see, sorry, I did not fully understand your question. I do not know the correct way to address this. I will see if we can get some additional attention here.

I suppose one option is to reference a species-appropriate database, which may not contain all identified peptides. In most de novo analyses, there should be a database that provides some context. But this is not very satisfying. And if you are writing a generic de novo tool, it will likely not have access to such a database.

It seems like your choice is to leave it out (and produce a file that would not validate) or make up something untrue to satisfy the validator. I think I would do the first, but I'll see if we can get some more input on this.

@javizca
Contributor

javizca commented Jan 17, 2024

We never made an example file for de novo search at the time (there was no-one involved with that particular interest at the time) and we may have missed this?

One possibility would be to "satisfy the validator" by making a "hack", creating an element stating clearly that this is not relevant since it is a de novo search (we could use the accession and id attributes for that). We would need to agree on how to do that, although it would probably not be very "nice" (as Eric said). We would not need to change the schema.

Alternatively, we could also make other changes in the schema in the case of a de novo search, but that would need to be discussed in detail. We could still do it this year since mzIdentML 1.3 is work in progress.

@rzhang90
Author

> Ah, I see, sorry, I did not fully understand your question. I do not know the correct way to address this. I will see if we can get some additional attention here.
>
> I suppose one option is to reference a species-appropriate database, which may not contain all identified peptides. In most de novo analyses, there should be a database that provides some context. But this is not very satisfying. And if you are writing a generic de novo tool, it will likely not have access to such a database.
>
> It seems like your choice is to leave it out (and produce a file that would not validate) or make up something untrue to satisfy the validator. I think I would do the first, but I'll see if we can get some more input on this.

Thanks Eric! Producing a file that would not validate sounds feasible. I'll try it.
I really appreciate your help. Looking forward to more support for de novo in mzIdentML 1.3.

@rzhang90
Author

> We never made an example file for de novo search at the time (there was no-one involved with that particular interest at the time) and we may have missed this?
>
> One possibility would be to "satisfy the validator" by making a "hack", creating an element stating clearly that this is not relevant since it is a de novo search (we could use the accession and id attributes for that). We would need to agree on how to do that, although it would probably not be very "nice" (as Eric said). We would not need to change the schema.
>
> Alternatively, we could also make other changes in the schema in the case of a de novo search, but that would need to be discussed in detail. We could still do it this year since mzIdentML 1.3 is work in progress.

Hi Juan! Thanks a lot for your response. I'll try adding a database file that is not actually used by the de novo search. I hope we will have more support for de novo in mzIdentML 1.3. An example mzid file would be very helpful.

Thanks again for your help!

@rzhang90
Author

Hi @javizca and @edeutsch, I'm wondering if you have a timeline for v1.3. Thanks.

@javizca
Contributor

javizca commented Jan 24, 2024

The version 1.3 specification document is under review at the moment, as is the corresponding paper (see the review at https://www.authorea.com/users/672647/articles/671627-mzidentml-1-3-0-essential-progress-on-the-support-of-crosslinking-and-other-identifications-based-on-multiple-spectra).
It focuses on improvements to support cross-linking MS.

If there is enough interest, I guess we could draft an example file for de novo, during the PSI meeting which will take place in Kyoto in March.

@javizca
Contributor

javizca commented Mar 20, 2024

Issue discussed during PSI meeting 2024

We have created a CV param in the MS ontology MS:1000394 "de novo search or no database used"

Additionally, in the element, add a called "No database"

We will incorporate this into mzIdentML 1.3.0 for de novo, before the final version
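One possible reading of this decision is sketched below. The element names in the comment above did not survive, so everything structural here is an assumption: the sketch guesses that the placeholder is a `SearchDatabase` element carrying the new CV param, with a `userParam` named "No database" inside `DatabaseName`. The ids, the empty `location`, and the helper names are hypothetical, and the accession is copied from the comment as written. The final 1.3.0 specification is the authority on the real layout.

```python
import xml.etree.ElementTree as ET

# CV term exactly as quoted in the comment above.
NO_DB_ACCESSION = "MS:1000394"
NO_DB_NAME = "de novo search or no database used"


def build_no_database_placeholder(db_id="SDB_none"):
    """Build a placeholder SearchDatabase (assumed layout, not from the spec)."""
    # location is left empty here as a guess; the spec may require another value.
    db = ET.Element("SearchDatabase", id=db_id, location="")
    name = ET.SubElement(db, "DatabaseName")
    ET.SubElement(name, "userParam", name="No database")
    ET.SubElement(
        db, "cvParam", cvRef="PSI-MS", accession=NO_DB_ACCESSION, name=NO_DB_NAME
    )
    return db


def reference_placeholder(spectrum_identification, db_id="SDB_none"):
    """Add the SearchDatabaseRef that the schema requires, pointing at the placeholder."""
    return ET.SubElement(
        spectrum_identification, "SearchDatabaseRef", searchDatabase_ref=db_id
    )


placeholder = build_no_database_placeholder()
analysis = ET.Element("SpectrumIdentification", id="SI_denovo")
reference_placeholder(analysis, placeholder.get("id"))
print(ET.tostring(placeholder, encoding="unicode"))
print(ET.tostring(analysis, encoding="unicode"))
```

Under these assumptions the file keeps the required SearchDatabaseRef while stating explicitly that no database was searched.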

@rzhang90
Author

> Issue discussed during PSI meeting 2024
>
> We have created a CV param in the MS ontology MS:1000394 "de novo search or no database used"
> Additionally, in the element, add a called "No database"
>
> We will incorporate this into mzIdentML 1.3.0 for de novo, before the final version

Hi @javizca and @edeutsch, thanks a lot for the update. Looking forward to version 1.3.0!
