Potential bug in panaroo #303

Closed
Hocnonsense opened this issue Aug 2, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@Hocnonsense

I'm reading the Panaroo code and have found two things that are difficult to understand.

The first is in cdhit.py: is "aS=AS" a typo? I think the call should be

            run_cdhit(
                ...
                aL=aL,
                AL=AL,
                aS=aS,
                AS=AS,
                ...
            )

in panaroo/panaroo/cdhit.py, lines 186 to 210 at commit e928a7a:

    if dna:
        run_cdhit_est(input_file=temp_input_file.name,
                      output_file=temp_output_file.name,
                      id=id,
                      s=s,
                      aL=aL,
                      AL=AL,
                      aS=AS,
                      accurate=accurate,
                      use_local=use_local,
                      strand=strand,
                      quiet=quiet,
                      n_cpu=n_cpu)
    else:
        run_cdhit(input_file=temp_input_file.name,
                  output_file=temp_output_file.name,
                  id=id,
                  s=s,
                  aL=aL,
                  AL=AL,
                  aS=AS,
                  accurate=accurate,
                  use_local=use_local,
                  quiet=quiet,
                  n_cpu=n_cpu)
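
To make the suspected effect concrete, here is a toy sketch. `run_tool`, `cluster_buggy` and `cluster_fixed` are hypothetical stand-ins, not Panaroo's real functions or signatures, and the default values are only illustrative. It shows that passing `aS=AS` overwrites the `aS` value and never forwards `AS`, so the callee silently falls back to its own default for `AS`.

```python
def run_tool(aL=0.0, AL=99999999, aS=0.0, AS=99999999):
    """Hypothetical stand-in for the wrapped call; returns what it received."""
    return {"aL": aL, "AL": AL, "aS": aS, "AS": AS}


def cluster_buggy(aL=0.0, AL=99999999, aS=0.0, AS=99999999):
    # Mirrors the pattern in the quoted snippet: AS is passed under the name aS.
    return run_tool(aL=aL, AL=AL, aS=AS)


def cluster_fixed(aL=0.0, AL=99999999, aS=0.0, AS=99999999):
    # Each argument is forwarded under its own name.
    return run_tool(aL=aL, AL=AL, aS=aS, AS=AS)


buggy = cluster_buggy(aS=0.9, AS=50)
fixed = cluster_fixed(aS=0.9, AS=50)
print(buggy)  # aS carries the AS value; AS is left at the callee's default
print(fixed)  # aS and AS both arrive as given
```

No exception is raised in the buggy variant because both names are valid keywords of the callee, which is why this kind of typo is easy to miss.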

The second is in generate_network.py: I think genes at the end of contigs should be marked with node['hasEnd']=True.

    for i, id in enumerate(seq_ids):
        current_cluster = seq_to_cluster[id]
        seqid_to_centroid[id] = cluster_centroids[current_cluster]
        loc = id.split("_")
        genome_id = int(loc[0])
        if loc[-1] == "0":
            # we're at the start of a contig
            if prev is not None: G.nodes[prev]['hasEnd'] = True
            prev = current_cluster
            if G.has_node(prev) and (prev not in paralogs):

However, when the loop finishes, the last gene is ignored: nothing marks it as a contig end before the function returns.

            prev = current_cluster

    return G, centroid_context, seqid_to_centroid

Is it necessary to add another check after the loop, such as:

    if prev is not None:  # end of cycle
        G.nodes[prev]["hasEnd"] = True

    return G, centroid_context, seqid_to_centroid
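
The off-by-one can be reproduced in isolation with a minimal sketch. `contig_end_genes` and its `flush_last` flag are hypothetical and only imitate the pattern above; the sketch assumes sequence IDs of the form `genome_contig_gene`, where a gene index of `0` starts a new contig, and it collects gene IDs in a set instead of setting attributes on a graph.

```python
def contig_end_genes(seq_ids, flush_last):
    """Collect the IDs of genes that are last on their contig.

    A gene is only recognised as a contig end when the *next* contig starts,
    so the final gene needs an explicit check after the loop.
    """
    ends = set()
    prev = None
    for seq_id in seq_ids:
        if seq_id.split("_")[-1] == "0":
            # Start of a new contig: the previous gene closed the last one.
            if prev is not None:
                ends.add(prev)
        prev = seq_id
    if flush_last and prev is not None:
        # The proposed extra check: the last gene seen also ends a contig.
        ends.add(prev)
    return ends


ids = ["0_0_0", "0_0_1", "0_1_0", "0_1_1", "0_1_2"]
without_check = contig_end_genes(ids, flush_last=False)
with_check = contig_end_genes(ids, flush_last=True)
print(sorted(without_check))  # the end of the final contig is missing
print(sorted(with_check))     # both contig ends are present
```

In this toy input the only difference between the two runs is the last gene of the last contig, `0_1_2`.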

Thanks in advance for any reply!

@gtonkinhill
Owner

Hi,

Sorry for the delayed reply, and thank you very much for flagging these issues.

The first is indeed a typo, which I will fix in the next release. However, it should not affect the output, as Panaroo only uses global alignment (use_local=False), which overrides this parameter.

The second issue appears to be a potential bug, although it would likely have a very small impact on the output. It could result in at most one miscalled gene if an annotation error occurred in the last gene processed by Panaroo. I will investigate this further and fix it if necessary in the next release.

@gtonkinhill gtonkinhill added the bug Something isn't working label Aug 16, 2024
gtonkinhill added a commit that referenced this issue Sep 20, 2024