Training PREP error: AttributeError: 'Annotation' object has no attribute 'seq' #223
Comments
Hi @sfcooke96, glad to hear you are trying TweetyNet and vak. You've done everything the right way.
The fix for now is to convert your annotations to a sequence. Here's an example of how you would do this with a Raven selection table.
import crowsetta
import numpy as np
example = crowsetta.data.get('raven')
raven = crowsetta.formats.bbox.Raven.from_file(example.annot_path, annot_col='Species')
annot = raven.to_annot()
onsets_s = []
offsets_s = []
labels = []
for bbox in annot.bboxes:
    onsets_s.append(bbox.onset)
    offsets_s.append(bbox.offset)
    labels.append(bbox.label)
onsets_s = np.array(onsets_s)
offsets_s = np.array(offsets_s)
labels = np.array(labels)
simpleseq = crowsetta.formats.seq.SimpleSeq(
    onsets_s=onsets_s,
    offsets_s=offsets_s,
    labels=labels,
    annot_path='/dummy/path'
)
simpleseq.to_file('example-data.csv')
Please test this out for me and let me know if it works. I'm afraid you might actually get an error if you do this with your data, because of the extra columns, such as the extracted features. If so, this would be a bug we need to fix in crowsetta; we want you to be able to work with any selection table that Raven saves as a txt file. If you do get such a bug, please report it at https://github.com/vocalpy/crowsetta/issues. I really appreciate it--we tried to make things easy for Raven users but I have to admit I haven't spent a lot of time with it yet.
There is a workaround, which would be to load the txt file directly with pandas and then convert it to a simple-seq annotation the same way I did above. I can reply with a snippet showing you how if we need to.
I am happy to help you get this figured out here on this issue, but just for future reference, this repo mainly exists for the paper, and TweetyNet is now built into vak. You can ask questions in the VocalPy forum, and you can report bugs / request features on the vak issue tracker.
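The pandas workaround mentioned above could look roughly like the sketch below. It is not the snippet offered in this thread, only an illustration under stated assumptions: the selection table is tab-separated, has one row per selection, uses Raven's default `Begin Time (s)` and `End Time (s)` column names, and stores labels in a `Species` column. The output uses the `onset_s`, `offset_s`, and `label` columns of the simple-seq format. The in-memory table and the helper name `raven_to_simple_seq` are made up for the example.

```python
import io

import pandas as pd

# Hypothetical stand-in for a Raven selection table (.txt, tab-separated).
# A real table would be read from disk and may carry many extra columns.
RAVEN_TXT = (
    "Selection\tView\tChannel\tBegin Time (s)\tEnd Time (s)\t"
    "Low Freq (Hz)\tHigh Freq (Hz)\tSpecies\n"
    "1\tSpectrogram 1\t1\t0.50\t0.75\t2000.0\t6000.0\tEATO\n"
    "2\tSpectrogram 1\t1\t1.10\t1.40\t1800.0\t5500.0\tEATO\n"
    "3\tSpectrogram 1\t1\t2.00\t2.30\t2500.0\t7000.0\tAMRO\n"
)


def raven_to_simple_seq(raven_txt, annot_col="Species"):
    """Keep only onset, offset, and label from a Raven selection table.

    Frequency bounds and any other extra columns are dropped, because a
    sequence-style annotation only stores segment times and labels.
    """
    df = pd.read_csv(raven_txt, sep="\t")
    simple_seq = pd.DataFrame(
        {
            "onset_s": df["Begin Time (s)"],
            "offset_s": df["End Time (s)"],
            "label": df[annot_col],
        }
    )
    # Sort by onset so segments appear in time order.
    return simple_seq.sort_values("onset_s").reset_index(drop=True)


simple_seq_df = raven_to_simple_seq(io.StringIO(RAVEN_TXT))
csv_text = simple_seq_df.to_csv(index=False)
# To write a file instead: simple_seq_df.to_csv("annotations.csv", index=False)
```

With a real file, the path to the .txt selection table would be passed in place of the `io.StringIO` object. If a table contains one row per view for each selection, duplicates would need to be dropped first.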
Hi @NickleDave, thanks for getting back! I will post any future questions in the repositories you mentioned. After a few edits I have the following python script:
import crowsetta
import numpy as np
example = crowsetta.data.get('raven')
raven = crowsetta.formats.bbox.raven.Raven.from_file(example.annot_path, annot_col='Species')
annot = raven.to_annot()
onsets_s = []
offsets_s = []
labels = []
for bbox in annot.bboxes:
    onsets_s.append(bbox.onset)
    offsets_s.append(bbox.offset)
    labels.append(bbox.label)
onsets_s = np.array(onsets_s)
offsets_s = np.array(offsets_s)
labels = np.array(labels)
simpleseq = crowsetta.formats.seq.SimpleSeq(
    onsets_s=onsets_s,
    offsets_s=offsets_s,
    labels=labels,
    annot_path='/User/training_data'
)
simpleseq.to_csv('annotations.csv')
What I'm running into is the following error:
Would the appropriate solution be to create a dataframe using pandas and write this to a .csv file following the instructions here: https://vak.readthedocs.io/en/latest/howto/howto_user_annot.html#howto-user-annot?
Another question while we're here: will training the model on simple-seq annotations restrict the predicted annotations to onset - offset borders without including high and low frequency bounds? I'm interested because I was hoping to estimate frequency ranges with the output data. Apologies if I'm misunderstanding how prediction output will be formatted.
Thanks again!
Whoops, my fault, that should have been
That was my next question for you. We have it on the to-do list for vak version 1.0 to add object detection models, which would give you frequency bounds. But those will only get added after some other development work in progress. AFAIK the main model people use when they want low/high freq bounds is DeepSqueak, but it's only in MATLAB. I haven't seen any Python implementations yet (but see my notes in the linked issue about OD models for some ideas). There are several deep learning frameworks that are meant for more general bioacoustics, but AFAIK they only output "detections" as defined here.
You might also look at Tessa's repo for other options: https://github.com/rhine3/bioacoustics-software Sorry we can't help you more right now! 🙁
Closing this based on your reply here, @sfcooke96. Please don't hesitate to reach out on the forums if you have more questions, and of course if you have a bug / need a feature feel free to raise an issue on the appropriate repo.
Hi there,
First time using TweetyNet and vak. I'm attempting to use vak to train TweetyNet on a small dataset of Raven .txt files and .wav files, and I can't seem to get past
vak prep gy6or6_train.toml
I keep getting the error:
Traceback (most recent call last):
  File "/User/miniconda3/envs/vak-env/bin/vak", line 10, in <module>
    sys.exit(main())
  File "/User/miniconda3/envs/vak-env/lib/python3.9/site-packages/vak/__main__.py", line 48, in main
    cli.cli(command=args.command, config_file=args.configfile)
  File "/User/miniconda3/envs/vak-env/lib/python3.9/site-packages/vak/cli/cli.py", line 54, in cli
    COMMAND_FUNCTION_MAP[command](toml_path=config_file)
  File "/User/miniconda3/envs/vak-env/lib/python3.9/site-packages/vak/cli/cli.py", line 28, in prep
    prep(toml_path=toml_path)
  File "/User/miniconda3/envs/vak-env/lib/python3.9/site-packages/vak/cli/prep.py", line 122, in prep
    dataset_df, dataset_path = prep_module.prep(
  File "/User/miniconda3/envs/vak-env/lib/python3.9/site-packages/vak/prep/prep_.py", line 216, in prep
    dataset_df, dataset_path = prep_parametric_umap_dataset(
  File "/User/miniconda3/envs/vak-env/lib/python3.9/site-packages/vak/prep/parametric_umap/parametric_umap.py", line 214, in prep_parametric_umap_dataset
    dataset_df, shape = prep_unit_dataset(
  File "/User/miniconda3/envs/vak-env/lib/python3.9/site-packages/vak/prep/unit_dataset/unit_dataset.py", line 345, in prep_unit_dataset
    annot_labelset = set(annot.seq.labels)
AttributeError: 'Annotation' object has no attribute 'seq'
I have attached an example of the training .txt file as well as a copy of the ....train.toml file for reference.
Any help is appreciated!
Stephen
copyof_train.toml.pdf
[example_data.pdf](https://github.com/yardencsGitHub/tweetynet/files/13982834/example_data.pdf)