Fix invalid results in output #23

Merged
merged 2 commits into from
Sep 21, 2024

Conversation

AndreaGuarracino
Member

This fixes a long-hidden bug that affects all output formats. When a target range is not found in a query, no results should be emitted. Because the results were never checked, many invalid ranges were returned (wrong ranges, ranges with length 0, or empty projected CIGAR strings).
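The kind of check described above can be sketched roughly as follows. This is not the code from this PR: the tuple layout, the `keep_if_valid` name, and the assumption that a projected target range must fall inside the requested region (with start < end, as in PAF) are all hypothetical and only illustrate the three failure modes listed.

```python
from typing import List, Optional, Tuple

# Hypothetical result shape, for illustration only:
# (query_start, query_end, target_start, target_end, cigar_ops)
Result = Tuple[int, int, int, int, List[str]]


def keep_if_valid(result: Result, region_start: int, region_end: int) -> Optional[Result]:
    """Return the result unchanged if it looks valid, otherwise None."""
    query_start, query_end, target_start, target_end, cigar_ops = result

    # Empty projected CIGAR: the target range was not found in this alignment.
    if not cigar_ops:
        return None

    # Ranges with length 0 (or inverted) carry no information.
    if query_end <= query_start or target_end <= target_start:
        return None

    # Assumption: a projected target range must lie inside the requested region.
    if target_start < region_start or target_end > region_end:
        return None

    return result


# Only results that pass the check would be emitted.
candidates = [
    (0, 10, 100, 110, ["10="]),  # valid
    (0, 10, 100, 110, []),       # empty CIGAR
    (0, 10, 100, 100, ["10="]),  # zero-length target range
    (0, 10, 10, 20, ["10="]),    # target outside the requested region
]
emitted = [r for r in candidates if keep_if_valid(r, 50, 200) is not None]
```

Returning `None` rather than an empty record keeps the caller from having to distinguish "no hit" from "hit with empty fields", which is the situation the invalid rows in this PR fall into.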

Here is an example with PAF output that also shows the wrong target ranges (outside the interval used to query the alignments):

# number of wrong results
impg -p chr6.y1.s10p95.paf -r grch38#chr6:31972057-32055418 -P | grep 'cg:Z:$' -c     
28092

# number of valid results
impg -p chr6.y1.s10p95.paf -r grch38#chr6:31972057-32055418 -P | grep 'cg:Z:$' -cv
232

# A few invalid results
impg -p chr6.y1.s10p95.paf -r grch38#chr6:31972057-32055418 -P | grep 'cg:Z:$' | head | column -t     
HG03579#2#JAGYVT010000047.1  26762912  15990000  16040000  +  grch38#chr6  170805979  15961422  16011615  0  0  255  gi:f:NaN  bi:f:NaN  cg:Z:
HG02080#1#JAHEOW010000032.1  56349618  15980000  16030000  +  grch38#chr6  170805979  15961437  16011454  0  0  255  gi:f:NaN  bi:f:NaN  cg:Z:
NA18906#2#JAHEON010000095.1  24213007  15940000  15990000  +  grch38#chr6  170805979  15962138  16012310  0  0  255  gi:f:NaN  bi:f:NaN  cg:Z:
HG01071#2#JAHBCE010000081.1  25027143  15930000  15980000  +  grch38#chr6  170805979  15962428  16012508  0  0  255  gi:f:NaN  bi:f:NaN  cg:Z:
HG03486#1#JAHEOQ010000114.1  19325645  0         8340000   +  grch38#chr6  170805979  15962549  16012596  0  0  255  gi:f:NaN  bi:f:NaN  cg:Z:
HG01952#2#JAHAMD010000016.1  59464697  15980000  16030000  +  grch38#chr6  170805979  15962782  16012783  0  0  255  gi:f:NaN  bi:f:NaN  cg:Z:
HG02723#2#JAHEOT010000038.1  23679014  10940000  10990000  +  grch38#chr6  170805979  15962796  16012895  0  0  255  gi:f:NaN  bi:f:NaN  cg:Z:
HG00438#1#JAHBCB010000067.1  24808650  15980000  16030000  +  grch38#chr6  170805979  15963320  16013504  0  0  255  gi:f:NaN  bi:f:NaN  cg:Z:
HG02559#2#JAGYVJ010000064.1  59272483  15940000  15990000  +  grch38#chr6  170805979  15963723  16013901  0  0  255  gi:f:NaN  bi:f:NaN  cg:Z:
HG02486#2#JAGYVL010000026.1  60776310  15850000  15900000  +  grch38#chr6  170805979  15963767  16013866  0  0  255  gi:f:NaN  bi:f:NaN  cg:Z:
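The rows above can also be audited outside of `grep`. The following is a minimal sketch, not part of impg: the function names are hypothetical, and it assumes standard PAF column order (target name, start, and end in columns 6, 8, and 9) and that the region string has the `name:start-end` form used with `-r` above.

```python
from typing import List, Tuple


def parse_region(region: str) -> Tuple[str, int, int]:
    """Split 'name:start-end' into its parts (name may itself contain '#')."""
    name, _, span = region.rpartition(":")
    start, _, end = span.partition("-")
    return name, int(start), int(end)


def classify_paf_line(line: str, region: str) -> List[str]:
    """Return the problems found in one PAF line; an empty list means it looks valid."""
    fields = line.split()
    region_name, region_start, region_end = parse_region(region)
    target_name, target_start, target_end = fields[5], int(fields[7]), int(fields[8])

    # The CIGAR is carried in the optional cg:Z: tag after the 12 fixed columns.
    cigar = next((f[len("cg:Z:"):] for f in fields[12:] if f.startswith("cg:Z:")), "")

    problems = []
    if not cigar:
        problems.append("empty CIGAR")
    if target_end <= target_start:
        problems.append("zero-length target range")
    if target_name != region_name or target_start < region_start or target_end > region_end:
        problems.append("target outside requested region")
    return problems


# First invalid row from the listing above, checked against the queried region.
row = ("HG03579#2#JAGYVT010000047.1 26762912 15990000 16040000 + "
       "grch38#chr6 170805979 15961422 16011615 0 0 255 gi:f:NaN bi:f:NaN cg:Z:")
print(classify_paf_line(row, "grch38#chr6:31972057-32055418"))
```

For this row the target range 15961422-16011615 lies well before the queried interval and the `cg:Z:` tag is empty, so both problems are reported.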

@AndreaGuarracino AndreaGuarracino changed the title Do not add empty results Fix invalid results in output Sep 21, 2024
@AndreaGuarracino AndreaGuarracino merged commit 9a1092a into main Sep 21, 2024
2 checks passed