if "-" in word and word[0] != "-": TypeError: argument of type 'float' is not iterable #41

Open
dzianisv opened this issue Oct 9, 2024 · 3 comments

@dzianisv
Contributor

dzianisv commented Oct 9, 2024

I tried to vocalize the book https://library.lol/main/7AB96C0626A64D7AFA33F68DC6D97CE3 and the run crashed after processing finished. Log and traceback:

--- attribution: 111.759 seconds ---
--- name coref: 13.472 seconds ---
--- coref: 110.698 seconds ---
--- TOTAL (excl. startup): 382.830 seconds ---, 258918 words
deleted file: /Users/engineer/Downloads/Neil Howe - The Fourth Turning Is Here _ What the Seasons of History Tell Us about How and When This Crisis Will End-Simon & Schuster (2023).txt because its not needed anymore after the ebook convertsion to txt
Success, File processed successfully!
Processed 3306 lines.
Removed 0 problematic lines.
Wrote 3306 lines back to the file.
Saved nonquotes.csv to Working_files/Book/non_quotes.csv
All processing complete!
Traceback (most recent call last):
  File "/Volumes/Backup/Users/engineer/workspace/VoxNovel/gui_run.py", line 696, in <module>
    main()
  File "/Volumes/Backup/Users/engineer/workspace/VoxNovel/gui_run.py", line 691, in main
    process_files(q_file, matching_entities_files[0])
  File "/Volumes/Backup/Users/engineer/workspace/VoxNovel/gui_run.py", line 619, in process_files
    if is_pronoun(mention):
  File "/Volumes/Backup/Users/engineer/workspace/VoxNovel/gui_run.py", line 596, in is_pronoun
    tagged_word = nltk.pos_tag([word])
  File "/Users/engineer/miniconda3/envs/VoxNovel/lib/python3.10/site-packages/nltk/tag/__init__.py", line 169, in pos_tag
    return _pos_tag(tokens, tagset, tagger, lang)
  File "/Users/engineer/miniconda3/envs/VoxNovel/lib/python3.10/site-packages/nltk/tag/__init__.py", line 126, in _pos_tag
    tagged_tokens = tagger.tag(tokens)
  File "/Users/engineer/miniconda3/envs/VoxNovel/lib/python3.10/site-packages/nltk/tag/perceptron.py", line 194, in tag
    context = self.START + [self.normalize(w) for w in tokens] + self.END
  File "/Users/engineer/miniconda3/envs/VoxNovel/lib/python3.10/site-packages/nltk/tag/perceptron.py", line 194, in <listcomp>
    context = self.START + [self.normalize(w) for w in tokens] + self.END
  File "/Users/engineer/miniconda3/envs/VoxNovel/lib/python3.10/site-packages/nltk/tag/perceptron.py", line 303, in normalize
    if "-" in word and word[0] != "-":
TypeError: argument of type 'float' is not iterable
@DrewThomasson
Owner

DrewThomasson commented Oct 10, 2024

Yeah, some books do that, and I'm not sure why.

I think something is wrong with the logic in my code for that step, because it's weird that some books trigger it and others don't.

It happens even when the input file is an EPUB, which is the format with the best compatibility.

I'll see when I can get to hunting that down.

In the meantime I'll keep this issue open to remind me lol

For now, I guess try some other books.

@dzianisv
Contributor Author

dzianisv commented Oct 10, 2024

In this case, word is for some reason a float rather than a string by the time it reaches normalize() in nltk/tag/perceptron.py, so the check "-" in word raises the TypeError.
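A minimal sketch of one possible guard, not a confirmed fix. It assumes the float is a NaN produced when an empty cell is read from a CSV (pandas does this by default), which nothing in this thread confirms. `clean_mention` and `reproduce_error` are hypothetical helper names, not functions from VoxNovel or nltk.

```python
def clean_mention(value):
    """Return value as a stripped string, or None if it is not usable text.

    Hypothetical helper: anything that is not a str (for example a float
    NaN from an empty CSV cell) is rejected before it can reach the tagger.
    """
    if not isinstance(value, str):
        return None
    text = value.strip()
    return text or None


def reproduce_error():
    """Run the same membership test as the last traceback frame on a float."""
    word = float("nan")
    try:
        "-" in word  # same expression as in perceptron.py's normalize()
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(reproduce_error())
    print(clean_mention(float("nan")), clean_mention(" she "))
```

Under that assumption, is_pronoun in gui_run.py could call `clean_mention(word)` first and return False when it gets None, so that nltk.pos_tag only ever receives strings.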

@DrewThomasson DrewThomasson pinned this issue Oct 10, 2024
@MrDesjardins

MrDesjardins commented Oct 14, 2024

I have the same issue with a book I was trying to convert (an EPUB exported with the Amazon Kindle tool). Here are the last few log lines before the crash:

Cannot resolve quotation
--- attribution: 2.777 seconds ---
--- name coref: 0.078 seconds ---
--- coref: 43.001 seconds ---
--- TOTAL (excl. startup): 191.703 seconds ---, 244422 words
deleted file: /home/user1/code/voxnovel/AudioBook.txt because its not needed anymore after the ebook convertsion to txt
Success, File processed successfully!
Processed 323 lines.
Removed 0 problematic lines.
Wrote 323 lines back to the file.
Saved nonquotes.csv to Working_files/Book/non_quotes.csv
All processing complete!
Traceback (most recent call last):
  File "/home/user1/VoxNovel/gui_run.py", line 696, in <module>
    main()
  File "/home/user1/VoxNovel/gui_run.py", line 691, in main
    process_files(q_file, matching_entities_files[0])
  File "/home/user1/VoxNovel/gui_run.py", line 619, in process_files
    if is_pronoun(mention):
  File "/home/user1/VoxNovel/gui_run.py", line 596, in is_pronoun
    tagged_word = nltk.pos_tag([word])
  File "/home/user1/miniconda3/lib/python3.10/site-packages/nltk/tag/__init__.py", line 169, in pos_tag
    return _pos_tag(tokens, tagset, tagger, lang)
  File "/home/user1/miniconda3/lib/python3.10/site-packages/nltk/tag/__init__.py", line 126, in _pos_tag
    tagged_tokens = tagger.tag(tokens)
  File "/home/user1/miniconda3/lib/python3.10/site-packages/nltk/tag/perceptron.py", line 194, in tag
    context = self.START + [self.normalize(w) for w in tokens] + self.END
  File "/home/user1/miniconda3/lib/python3.10/site-packages/nltk/tag/perceptron.py", line 194, in <listcomp>
    context = self.START + [self.normalize(w) for w in tokens] + self.END
  File "/home/user1/miniconda3/lib/python3.10/site-packages/nltk/tag/perceptron.py", line 303, in normalize
    if "-" in word and word[0] != "-":

Note that I also had to install pandas manually:

pip install pandas

Without it, the python gui_run.py command complained.
