regex not threadsafe #31521
context: #31309 (comment)
Am I correct that this is why the program at https://salsa.debian.org/benchmarksgame-team/benchmarksgame/issues/143 only gives correct output if the
const variants = (
    "agggtaaa|tttaccct",
    "[cgt]gggtaaa|tttaccc[acg]",
    "a[act]ggtaaa|tttacc[agt]t",
    "ag[act]gtaaa|tttac[agt]ct",
    "agg[act]taaa|ttta[agt]cct",
    "aggg[acg]aaa|ttt[cgt]ccct",
    "agggt[cgt]aa|tt[acg]accct",
    "agggta[cgt]a|t[acg]taccct",
    "agggtaa[cgt]|[acg]ttaccct"
)
const subs = (
    (r"tHa[Nt]", "<4>"),
    (r"aND|caN|Ha[DS]|WaS", "<3>"),
    (r"a[NSt]|BY", "<2>"),
    (r"<[^>]*>", "|"),
    (r"\|[^|][^|]*\|", "-")
)
function perf_regex_dna(io)
    seq = read(stdin, String)
    l1 = length(seq)
    seq = replace(seq, r">.*\n|\n" => "")
    l2 = length(seq)
    variant_counts = zeros(Int64, length(variants))
    @inbounds Threads.@threads for i in 1:length(variants)
        variant_counts[i] = length(collect(eachmatch(Regex(variants[i]), seq)))
    end
    for (v, k) in zip(variants, variant_counts)
        write(io, v, ' ', string(k), '\n')
    end # for
    for (u, v) in subs
        seq = replace(seq, u => v)
    end
    write(io, '\n', string(l1), '\n', string(l2), '\n', string(length(seq)), '\n')
end
perf_regex_dna(stdout)
😁
@non-Jedi Yes, I would say so. Please try it and see if it works now.
Sorry for the delay in replying. I just tested with 1.3 (which I assume includes this commit), and the bug is gone.
There's a race condition for creating the regex JIT cache from different threads.
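One way to check whether a given Julia build is affected is to compare serial and threaded match counts on a small input. The following is only a minimal sketch, not the test used in this thread: the names `patterns`, `count_matches`, `counts_serial`, and `counts_threaded` and the sample sequence are made up for illustration, and a race may not show up on every run or with so little input.

```julia
# Hypothetical check (names and input are illustrative, not from the issue):
# count regex matches serially and under Threads.@threads, then compare.
const patterns = ("agggtaaa|tttaccct", "[cgt]gggtaaa|tttaccc[acg]")

# Build the Regex inside the call, as the benchmark program does.
count_matches(pattern, seq) = length(collect(eachmatch(Regex(pattern), seq)))

# Reference result: one pattern after another on a single thread.
counts_serial(seq) = [count_matches(p, seq) for p in patterns]

# Same work, one pattern per loop iteration, spread across threads.
function counts_threaded(seq)
    counts = zeros(Int, length(patterns))
    Threads.@threads for i in 1:length(patterns)
        counts[i] = count_matches(patterns[i], seq)
    end
    return counts
end

seq = "agggtaaa"^3 * "tttaccct" * "cgggtaaa"
serial = counts_serial(seq)
threaded = counts_threaded(seq)
println(serial == threaded ? "counts agree" : "counts differ: $serial vs $threaded")
```

Start Julia with more than one thread (for example by setting `JULIA_NUM_THREADS`) so that the threaded loop actually runs concurrently; with a single thread both paths are effectively serial and will always agree.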