augur curate I/O: verify NDJSON records have the same fields #1510
Comments
Probably sensible to do this for outputs too, to assert the curate subcommand's |
Some more details about TSV writing via our generalised
|
to match expected behaviour in tests. The main functional changes are around the order of fields, where we now rename "in-place" rather than adding the renamed column at the end (which for TSV output is the last column). More sanity checks are performed on arguments and they are cross-referenced with the provided records. Note that this relies on each record having the same fields, and this is not asserted here. See <#1510>
Per discussion on #1508 and #1511, the field standardization across records (cf #1510) makes the need to verify a `database` field less important — essentially, if there's a `geo_loc_name` field (or a field with the name given in the `--location-field` argument), parse it. Otherwise, warn that it's not found.
Context
Originally discussed in #1506 (comment)
`augur curate` records can be output to a metadata TSV file, which uses the first record's fields as output columns.
See augur/augur/io/metadata.py, lines 467 to 474 in f6ee377.
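To illustrate why this matters, here is a minimal sketch of a TSV writer that takes its columns from the first record only. It is not the code in `augur/io/metadata.py`: the function name `write_tsv_from_first_record` is hypothetical, and the choice to ignore extra fields and blank out missing ones is an assumption made for the example.

```python
import csv
import io


def write_tsv_from_first_record(records, handle):
    """Write records as TSV, using only the first record's keys as columns.

    Hypothetical helper for illustration; not augur's implementation.
    """
    records = iter(records)
    try:
        first = next(records)
    except StopIteration:
        return []  # nothing to write

    columns = list(first.keys())  # dicts preserve insertion order
    writer = csv.DictWriter(
        handle,
        fieldnames=columns,
        delimiter="\t",
        lineterminator="\n",
        extrasaction="ignore",  # assumption: silently drop unknown fields
        restval="",             # assumption: blank out missing fields
    )
    writer.writeheader()
    writer.writerow(first)
    for record in records:
        writer.writerow(record)
    return columns


records = [
    {"strain": "A", "date": "2024-01-01"},
    {"strain": "B", "country": "Kenya"},  # different fields from the first record
]
buffer = io.StringIO()
columns = write_tsv_from_first_record(records, buffer)
tsv_text = buffer.getvalue()
```

Under these assumptions the second record loses its `country` value and gains an empty `date` without any warning, which is the kind of silent mismatch an upfront field check would catch.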
With that in mind, the centralized inputs parser should verify that the input records all have the same fields.
Then subcommands can operate under the assumption that all records should have the same fields and make changes accordingly.
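One way such a check could look is sketched below. This is only an illustration of the idea, not a proposal for the actual parser: `read_ndjson`, `verify_same_fields`, and `InconsistentFieldsError` are hypothetical names, and the sketch assumes that field order does not matter and that the first record defines the expected fields.

```python
import json


class InconsistentFieldsError(ValueError):
    """Raised when a record's fields differ from the first record's fields."""


def read_ndjson(lines):
    """Yield one dict per non-empty NDJSON line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def verify_same_fields(records):
    """Yield records unchanged, raising if any record's field set differs
    from the first record's field set (order is not compared)."""
    expected = None
    for index, record in enumerate(records):
        fields = set(record)
        if expected is None:
            expected = fields
        elif fields != expected:
            missing = sorted(expected - fields)
            unexpected = sorted(fields - expected)
            raise InconsistentFieldsError(
                f"Record {index} does not match the first record's fields: "
                f"missing={missing}, unexpected={unexpected}"
            )
        yield record


ndjson_lines = [
    '{"strain": "A", "date": "2024-01-01"}',
    '{"date": "2024-02-01", "strain": "B"}',
]
checked = list(verify_same_fields(read_ndjson(ndjson_lines)))
```

Because the check is a generator, it validates records as they stream through rather than loading the whole input first, so an error surfaces only when the offending record is reached.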