BUG: Issue converting raven.txt file to simple-seq #261
Comments
Hi @sfcooke96! Thank you for providing a detailed bug report and the zip with a couple samples to test with. 🙏
I think I might have confused you with my snippet on the other issue. When you use your data, you'll want to specify the path to those files as the first argument to
crowsetta.formats.bbox.Raven.from_file(
    'troubleshooting/data1.txt'
)
I was able to do this and load the file without issue.
You'll also need to loop over all your files and save each of them with a separate name, so you don't overwrite the previous one you saved.
import pathlib
import crowsetta
import numpy as np
# this is where we get our files from
src_dir = pathlib.Path('./troubleshooting')
# next line: sorted because
# https://www.vice.com/en/article/zmjwda/a-code-glitch-may-have-caused-errors-in-more-than-100-published-studies
src_txt_files = sorted(src_dir.glob('*.txt'))
# this is where we save the files (so we don't overwrite the originals)
dst_dir = pathlib.Path('./annots-simple-seq')
dst_dir.mkdir(exist_ok=True)
# to save ourselves from a typo
assert dst_dir != src_dir
for txt_file in src_txt_files:
    print(
        f"Converting Raven file to simple-seq format: {txt_file}"
    )
    annot = crowsetta.formats.bbox.Raven.from_file(
        txt_file
    ).to_annot()
    onsets_s = []
    offsets_s = []
    labels = []
    for bbox in annot.bboxes:
        onsets_s.append(bbox.onset)
        offsets_s.append(bbox.offset)
        labels.append(bbox.label)
    onsets_s = np.array(onsets_s)
    offsets_s = np.array(offsets_s)
    labels = np.array(labels)
    simpleseq = crowsetta.formats.seq.SimpleSeq(
        onsets_s=onsets_s,
        offsets_s=offsets_s,
        labels=labels,
        annot_path='/dummy/path/doesnt/matter/here'
    )
    dst_txt_file = dst_dir / txt_file.name
    print(
        f"Saving converted simple-seq file: {dst_txt_file}"
    )
    simpleseq.to_file(dst_txt_file)
Just let me know if you have any questions about what this is doing!
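If it helps to see the "separate name for each file" logic on its own, here is a minimal sketch that uses only the standard library. The helper name `plan_outputs` and the placeholder files are assumptions made up for illustration; none of this is part of crowsetta. It maps each source file to a destination with the same file name in a different directory, and raises an error if two inputs would land on the same output or if an output would overwrite its own source.

```python
import pathlib
import tempfile


def plan_outputs(src_files, dst_dir):
    """Map each source file to its own destination path (hypothetical helper)."""
    dst_dir = pathlib.Path(dst_dir)
    plan = {}
    for src in src_files:
        src = pathlib.Path(src)
        dst = dst_dir / src.name  # same file name, different directory
        if dst in plan.values():
            raise ValueError(f"two inputs would both be saved as {dst}")
        if dst.resolve() == src.resolve():
            raise ValueError(f"{dst} would overwrite its own source file")
        plan[src] = dst
    return plan


with tempfile.TemporaryDirectory() as tmp:
    # stand-ins for the real Raven .txt files
    src_dir = pathlib.Path(tmp) / "troubleshooting"
    src_dir.mkdir()
    for name in ("data1.txt", "data2.txt"):
        (src_dir / name).write_text("placeholder\n")

    dst_dir = pathlib.Path(tmp) / "annots-simple-seq"
    plan = plan_outputs(sorted(src_dir.glob("*.txt")), dst_dir)
    planned_names = [dst.name for dst in plan.values()]
    print(planned_names)
```

Running a check like this before the real conversion loop is optional; the `assert dst_dir != src_dir` line in the script already covers the most likely typo.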
Re: the TweetyNet model, please see my reply on the issue on the TweetyNet repo: yardencsGitHub/tweetynet#223 (comment)
@NickleDave, thank you - this solution seems to have worked! On to prepping, training, and predicting. Thanks a lot for your active support here! 🙏
Of course, glad to hear it's working @sfcooke96!
Hi there @NickleDave,
I'm running the following on a Mac with crowsetta version 5.0.1.
I tried using the following script (suggested here: yardencsGitHub/tweetynet#223) to convert my raven.txt files to simple-seq for use with vak and tweetynet.
After running this I got:
AttributeError: 'SimpleSeq' object has no attribute 'to_csv'
I adjusted the script slightly (raven = .... , simpleseq.to_file...) to the following:
I have 10 .txt files in my directory (> 15 rows per file) to be written into simple-seq format but the resulting output is the following (this is complete):
I tried adjusting the above code
raven = crowsetta.formats.bbox.raven.Raven.from_file(example.annot_path, annot_col='Species')
by changing annot_col to 'Annotation' (the header for the annotation column in my .txt files) and received the following output:
I've attached example data here, the python script, and output file. troubleshooting.zip
Another question while we're here: will training the model on simple-seq annotations restrict the predicted annotations to onset - offset borders without including high and low frequency bounds? I'm interested because I was hoping to estimate frequency ranges with the output data. Apologies if I'm misunderstanding how prediction output will be formatted.
Thanks for your help!
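On the frequency-bounds question, one thing can be read directly from the conversion script in this thread: it copies only the onset, offset, and label from each bounding box, so the low and high frequency values never reach the simple-seq files. The sketch below illustrates that with a hypothetical `Box` stand-in, which is not a crowsetta class, and made-up example values. It says nothing by itself about how the model's predictions are formatted.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical stand-in for one Raven selection (not a crowsetta class)."""
    onset: float      # seconds
    offset: float     # seconds
    low_freq: float   # Hz
    high_freq: float  # Hz
    label: str


def to_simple_seq_rows(boxes):
    """Keep only what the conversion loop copies: onset, offset, label."""
    return [(b.onset, b.offset, b.label) for b in boxes]


# made-up example values
boxes = [
    Box(0.5, 1.2, 2000.0, 8000.0, "a"),
    Box(1.5, 2.0, 1500.0, 6000.0, "b"),
]
rows = to_simple_seq_rows(boxes)
print(rows)
```

If frequency ranges are needed later, keeping the original Raven files alongside the converted ones preserves that information.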