I'm still on vacation, but there's a quiet moment while I wait for my parents to return from a dentist visit, so I thought I would throw out some ideas... I'm thinking about this in terms of barriers to submission. One barrier is whether the researcher can submit. The other is whether we can load. Right now, we allow submission without the researcher filling out all of the compound info, but we still have to interact with them to confirm that info, so whether they can submit is not the real barrier to focus on. In fact, letting the researcher handle the compound info speeds up the process, so the question (IMO) is: how do we streamline that process? One thought I've had (hear me out here) is that relying on, or requiring, a compound record to exist is something we could eliminate. After all, it's a "guess" by the researcher, and we never change the submitted peak group name, regardless of which compound record(s) are linked to the PeakGroup. The formula stored in the PeakGroup record is pretty much all we need to make Tracebase work, so I propose the following. We may not do this, but I think it's at least valuable as a perspective through which to view this issue:
This would allow Tracebase to work without all of the compound lookup requirements, and provide an avenue to make those edits after loading (at the researcher's leisure).
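To make the perspective concrete, here is a minimal sketch of a peak group that loads with only its submitted name and formula, and picks up compound links later. This is not Tracebase's actual model: the `PeakGroup` dataclass, the `compound_ids` field, `needs_compound_review`, and `link_compound` are all hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PeakGroup:
    """Hypothetical stand-in for a peak group record (not the real model)."""

    name: str      # name exactly as submitted; never rewritten
    formula: str   # the one piece of compound info assumed to be required
    compound_ids: list[str] = field(default_factory=list)  # optional links

    @property
    def needs_compound_review(self) -> bool:
        # True until at least one compound record has been linked.
        return not self.compound_ids


def link_compound(peak_group: PeakGroup, compound_id: str) -> None:
    """Attach a compound after loading, leaving the submitted name untouched."""
    if compound_id not in peak_group.compound_ids:
        peak_group.compound_ids.append(compound_id)


# Loading succeeds with no compound record at all.
pg = PeakGroup(name="citrate/isocitrate", formula="C6H8O7")
print(pg.needs_compound_review)  # True

# Later, at the researcher's leisure, a (placeholder) compound is linked.
link_compound(pg, "compound-placeholder-1")
print(pg.needs_compound_review)  # False
```

The point of the sketch is only that loading and compound curation become separate steps; how unlinked peak groups would be surfaced for review is left open.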
We have previously tried to construct a master list of compound names, but it was not enough to cover the additional compounds and synonyms encountered in a recent submission. I had to spend a couple of hours manually looking up HMDB IDs. Here I'll summarize my process, then describe ideas from Rob, and then add my own discussion.
How I manually searched for HMDB IDs to identify compounds from a new submission
The Study Doc contained a Compounds Sheet which indicated every unique Compound Name + Formula in the Annotated Files which I had submitted. If an entry was found in Tracebase, then the Tracebase synonyms and corresponding HMDB ID were also entered into the same Compounds sheet. Unidentified compounds had an empty HMDB ID.
For each compound, I first checked whether it was just a new synonym for an existing Tracebase compound. Specifically, I searched for the formula and/or name in the Compounds table on the live Tracebase site. If the correct HMDB ID was found, I copied it to the sheet. If the compound was not in Tracebase, I searched HMDB using the compound name.
Eventually, I filled the missing HMDB IDs with the correct value (or a not-available when I could not find an HMDB ID).
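The manual process above could be sketched roughly as follows. Everything here is an assumption for illustration: the `Compound` dataclass, `resolve_hmdb_id`, `candidates_by_formula`, and the `search_hmdb` callable are hypothetical names, not Tracebase or HMDB APIs, and the IDs are placeholders. Formula matches are only returned as candidates, since compounds sharing a formula still need a human to decide whether the name is really a new synonym.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

NOT_AVAILABLE = "not-available"


@dataclass
class Compound:
    """Hypothetical stand-in for a row of the Tracebase Compounds table."""

    name: str
    formula: str
    hmdb_id: str
    synonyms: list[str] = field(default_factory=list)

    def all_names(self) -> set[str]:
        # Primary name plus synonyms, normalized for comparison.
        return {n.strip().lower() for n in [self.name, *self.synonyms]}


def candidates_by_formula(formula: str, compounds: list[Compound]) -> list[Compound]:
    """Existing compounds with the same formula, for manual synonym review."""
    return [c for c in compounds if c.formula == formula]


def resolve_hmdb_id(
    name: str,
    compounds: list[Compound],
    search_hmdb: Optional[Callable[[str], Optional[str]]] = None,
) -> str:
    """Check existing compounds by name first, then fall back to an HMDB search."""
    key = name.strip().lower()
    for compound in compounds:
        if key in compound.all_names():
            return compound.hmdb_id
    if search_hmdb is not None:
        found = search_hmdb(name)  # assumed: returns an ID string or None
        if found:
            return found
    return NOT_AVAILABLE


known = [Compound("lactate", "C3H6O3", "EXAMPLE-0002", ["lactic acid"])]
print(resolve_hmdb_id("Lactic Acid", known))
print([c.name for c in candidates_by_formula("C3H6O3", known)])
print(resolve_hmdb_id("mystery compound", known))
```

Passing the HMDB search in as a callable keeps the sketch independent of how the lookup is actually performed (web search, local copy, or something else).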
Ideas to handle this
In validation, attempt to handle unidentified compounds:
Alternatively, could we just dump the entire HMDB list (IDs + names + formulas + synonyms) into Tracebase? This would avoid building a lookup feature into Tracebase, but might require periodic updates as the HMDB list grows. It's also fairly large (I think about 200k HMDB compounds).
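As a rough sketch of the "dump everything" idea, the snippet below builds a name-to-ID index from a local export. The tab-separated layout (`hmdb_id`, `name`, `formula`, `synonyms` separated by `;`) is a made-up format for illustration, not HMDB's actual download format, and the IDs are placeholders. `build_name_index` and `lookup` are hypothetical helpers.

```python
import csv
import io
from collections import defaultdict

# Placeholder rows in an assumed TSV layout (not HMDB's real export format).
SAMPLE_TSV = (
    "hmdb_id\tname\tformula\tsynonyms\n"
    "EXAMPLE-0001\tGlucose\tC6H12O6\tDextrose; Grape sugar\n"
    "EXAMPLE-0002\tLactate\tC3H6O3\tLactic acid\n"
)


def build_name_index(tsv_text: str) -> dict[str, set[str]]:
    """Map every normalized name and synonym to the set of IDs that use it."""
    index: dict[str, set[str]] = defaultdict(set)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        names = [row["name"], *row["synonyms"].split(";")]
        for name in names:
            key = name.strip().lower()
            if key:  # skip empty synonym fields
                index[key].add(row["hmdb_id"])
    return dict(index)


def lookup(index: dict[str, set[str]], name: str) -> list[str]:
    """Return matching IDs (possibly several, possibly none) for a name."""
    return sorted(index.get(name.strip().lower(), set()))


index = build_name_index(SAMPLE_TSV)
print(lookup(index, "dextrose"))
print(lookup(index, "not in the list"))
```

An index like this would have to be rebuilt whenever the source list is refreshed, which is the periodic-update cost mentioned above. A name can map to more than one ID, so the lookup returns a list rather than a single value.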