Common function to map from input fields to common fields #6

jorainer · 2019-04-23T09:30:37Z

Please correct me if I got this wrong: the idea is to map data from different input sources to a commonly agreed set of fields and to an object that can hold this data. So, the workflow would be:

1. Read the input file.
2. Map the names of the input file to commonly accepted names.
3. Put that into a result object.

So, 1) would be an input-type-specific function whose result is a named list of the file's elements. 2) uses the schema for the mapping, so it could be a single function for all parsers, right? 3) would also be a single function, as I see it.
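To make the three steps concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the schema contents, the function names (`read_msp_like`, `map_names`, `build_result`), and the plain dict standing in for the result object are hypothetical and not part of any existing package. Only step 1 is format specific; steps 2 and 3 are shared.

```python
# Hypothetical schema: format-specific field name -> common field name.
MSP_SCHEMA = {
    "Name": "name",
    "PrecursorMZ": "precursor_mz",
    "Instrument": "instrument",
}

def read_msp_like(text):
    """Step 1 (format specific): parse 'Key: value' lines into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a 'Key: value' shape
            fields[key.strip()] = value.strip()
    return fields

def map_names(fields, schema):
    """Step 2 (shared): rename known fields, keep unknown ones aside."""
    mapped, unmapped = {}, {}
    for key, value in fields.items():
        if key in schema:
            mapped[schema[key]] = value
        else:
            unmapped[key] = value
    return mapped, unmapped

def build_result(mapped, unmapped):
    """Step 3 (shared): stand-in for a Spectrum-like result object."""
    return {"metadata": mapped, "unmapped": unmapped}

record = "Name: Caffeine\nPrecursorMZ: 195.0877\nInstrument: Q-TOF\nFoo: bar"
result = build_result(*map_names(read_msp_like(record), MSP_SCHEMA))
print(result["metadata"])
```

Because the mapping is a single dictionary lookup per field, step 2 stays cheap even for large libraries; peak data is deliberately left out of this sketch.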

meowcat · 2019-04-23T11:13:19Z

Input and output, importantly. Otherwise I think you are correct.

I actually envisioned it slightly differently: 1) read input file, 2) map into a result (Spectrum/Spectra/...) object using the corresponding format's nomenclature/system/hierarchy, 3) map the Spectrum/Spectra/result object with custom names to a Spectrum/Spectra/result object with common names. Your workflow is more consistent because mine requires processing the actual peaks separately from / before all other information. It removes an intermediate that I think of as useful, but maybe I can figure out how to work without it.

jorainer · 2019-04-23T12:34:51Z

Do you already have a function that converts the names provided by the input file to the common names using the schema?

I think that function will be a key one that we need - it should also be fast, if possible.

meowcat · 2019-04-23T12:43:33Z

We are not progressing as quickly here, unfortunately, since I have to fit this work into my regular work somehow. Also, my first implementation will certainly not be a fast one.

jorainer · 2019-04-23T12:44:44Z

No prob. Was not sure if I just overlooked that one.

Treutler · 2019-04-23T13:58:47Z

Please keep in mind that there are multiple field names for the same value in the case of (at least) NIST *.msp and Bruker .library. E.g., the instrument in the NIST .msp format can be any of:

Instrument
Synon: $:07
Comments: instrument

I encoded this in the table as Instrument / Synon: $:07 / Comments: instrument.
Accordingly, we have to (i) support these different flavors for the import and (ii) decide which flavor to export.

meowcat · 2019-04-23T14:05:43Z

> Accordingly, we have to (i) support these different flavors for the import and (ii) decide which flavor to export.

(i) could be feasible by doing something like this:

- field: Synon
  node:
   - field: $:07
     map_read: instrument

or map: instrument, type: readonly. There will also be cases of nested mapping, where a sub-entry in one record format is a top-level entry in the general schema (e.g. possibly INCHIKEY, depending on how we define it).
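As a rough sketch of how such a nested, partly read-only schema could be turned into lookups, the following Python mirrors the YAML idea as plain data. The key names `field`, `node`, and `map_read` come from the snippet above; the `map` key for ordinary read/write entries, the function `build_lookups`, and the use of tuples as field paths are assumptions made for this example. The `$:07` key follows the spelling in Treutler's comment.

```python
# Hypothetical schema mirroring the YAML idea as plain Python data.
SCHEMA = [
    {"field": "Instrument", "map": "instrument"},
    {"field": "Synon", "node": [
        {"field": "$:07", "map_read": "instrument"},
    ]},
]

def build_lookups(schema, prefix=()):
    """Return (read_lookup, write_lookup) for a possibly nested schema.

    read_lookup maps a field path (tuple of names) to a common name.
    write_lookup maps a common name to the path used on export;
    entries declared with 'map_read' are never used for export.
    """
    read_lookup, write_lookup = {}, {}
    for entry in schema:
        path = prefix + (entry["field"],)
        if "map" in entry:
            read_lookup[path] = entry["map"]
            write_lookup.setdefault(entry["map"], path)
        if "map_read" in entry:
            read_lookup[path] = entry["map_read"]
        if "node" in entry:
            # Recurse into sub-entries, extending the path.
            sub_read, sub_write = build_lookups(entry["node"], path)
            read_lookup.update(sub_read)
            for name, sub_path in sub_write.items():
                write_lookup.setdefault(name, sub_path)
    return read_lookup, write_lookup

read_lookup, write_lookup = build_lookups(SCHEMA)
print(read_lookup)
print(write_lookup)
```

Building both lookups once per schema keeps the per-record work to dictionary lookups, which should help with the speed concern raised earlier in the thread.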

(ii) I guess every schema needs to choose a canonical export format.

Treutler · 2019-04-24T05:31:10Z

> (ii) I guess every schema needs to choose a canonical export format.

Agreed. I adjusted the fields in the table so that the first field is meant to be the canonical export format.
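The "first field is canonical" convention can be read mechanically from a table cell. The Python sketch below is only an illustration: the table layout (common name mapped to a cell string), the ` / ` separator handling, and the names `parse_aliases` and `build_import_export` are assumptions, not the actual table format used in the project.

```python
def parse_aliases(cell):
    """Split a table cell such as 'A / B / C' into a list of field names."""
    return [part.strip() for part in cell.split(" / ") if part.strip()]

# Hypothetical table: common name -> cell listing the format's field names.
NIST_MSP_TABLE = {
    "instrument": "Instrument / Synon: $:07 / Comments: instrument",
}

def build_import_export(table):
    """All aliases are accepted on import; the first alias is used on export."""
    import_map, export_map = {}, {}
    for common, cell in table.items():
        aliases = parse_aliases(cell)
        export_map[common] = aliases[0]  # first entry = canonical export name
        for alias in aliases:
            import_map[alias] = common
    return import_map, export_map

import_map, export_map = build_import_export(NIST_MSP_TABLE)
print(export_map["instrument"])
```

Splitting on ` / ` with surrounding spaces is a deliberate choice so that field names containing a bare slash would not be broken apart; whether that is needed depends on the real field names.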

meowcat mentioned this issue Apr 23, 2019

Common schema format #7

Closed
