Error when attempting to run clippy.py #1

SitrucL · 2020-12-10T21:52:17Z

I tried running the Python script and passed in the .txt file as instructed on the git page and in the YT video; however, I keep encountering this error:

Please let me know if there's anything I can do to resolve the issue.

Additional info:
Python 3.9.1
Kindle PW
Windows 10

dangbert · 2020-12-10T23:46:39Z

Hi, I'd like to help get this fixed for you. Would you mind sharing your "My Clippings.txt" file to help me replicate the issue?

Based on the error, I'm guessing your file uses an encoding the program doesn't expect.

One simple test would be to copy and paste the contents of your file (open it in Notepad) into a new file, which you then provide to clippy.py.

Let me know how it goes.

SitrucL · 2020-12-11T14:13:33Z

Hi dangbert,

I tried pasting the contents into a new .txt file, but that didn't seem to resolve the issue. Here is a copy of my clippings.txt; perhaps there are some characters within the file that are causing the issue. I wasn't sure from the error message which particular characters or lines to look at, though.

Thanks in advance

My Clippings.txt

dangbert · 2020-12-14T03:56:08Z

Interestingly, when I downloaded your file I didn't run into that encoding issue (although I'm using Ubuntu 20 and Python 3.8.5). Either uploading the file to GitHub changed its encoding, or this particular issue has something to do with our Python versions differing or with running on Windows vs. Ubuntu.

However, when I ran clippy.py on your file, I ran into another issue, which stems from something I'd already suspected: different Kindle versions format "My Clippings.txt" in slightly different ways.

Here you can see the difference between a highlight in my file (top) and yours (bottom):

Specifically, the problem is that your file lists both a page number and a location range for every highlight. My program currently fails to parse your file because it expects either a page number or a location range, but not both.

I built this program originally because the existing alternatives I tried didn't parse my Kindle's "My Clippings.txt" file correctly. The "My Clippings.txt" format is not great, and the fact that it seems to vary across different Kindle versions and language settings is a pain when it comes to parsing it.

But it's my goal to make clippy-kindle robust to different Kindle versions (and eventually to the language setting used as well). I can try to spend some time this week reworking the parser to support your file format, and then we can further investigate the encoding issue. Sorry for the trouble :/

AmmarShaqeel · 2021-04-01T10:34:02Z

I've had issues with this as well.
This is because the "My Clippings.txt" seems to have 4 different formats for the highlights.

Normal:

- Your Highlight at location 1177-1178 | Added on Monday, 19 June 2017 02:21:10

Very rare, short or blank highlights:

- Your Highlight at location 1177 | Added on Monday, 19 June 2017 02:21:10

Rare:

- Your Highlight on page 7 | Added on Monday, 19 June 2017 02:21:10

If the book has pages (not all publishers seem to go to the trouble of adding pages):

- Your Highlight on page 22 | location 325-325 | Added on Thursday, 15 June 2017 18:23:21

Something else to note is that the wording seems to vary between "Your highlight at location" and "Your highlight on page".
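For illustration, here is a minimal sketch of a single regular expression that accepts all four header variants listed above. It is not the parser that clippy.py actually uses; the names `META_RE`, `HighlightMeta`, and `parse_meta` are hypothetical, and the pattern only assumes the English wording shown in the examples.

```python
import re
from typing import NamedTuple, Optional


class HighlightMeta(NamedTuple):
    """Hypothetical container for the fields found in a highlight header."""
    page: Optional[int]
    loc_start: Optional[int]
    loc_end: Optional[int]
    added_on: str


# Page and location are each optional; when both appear they are
# separated by " | ". The preposition varies ("at" vs. "on").
META_RE = re.compile(
    r"^- Your Highlight (?:on|at) "
    r"(?:page (?P<page>\d+))?"
    r"(?:(?: \| )?location (?P<start>\d+)(?:-(?P<end>\d+))?)?"
    r" \| Added on (?P<added>.+)$",
    re.IGNORECASE,
)


def parse_meta(line: str) -> Optional[HighlightMeta]:
    """Return the parsed header, or None if the line is not a highlight header."""
    m = META_RE.match(line.strip())
    if m is None:
        return None
    page = int(m["page"]) if m["page"] else None
    start = int(m["start"]) if m["start"] else None
    # A single location ("location 1177") is treated as a one-element range.
    end = int(m["end"]) if m["end"] else start
    if page is None and start is None:
        return None  # neither a page nor a location was present
    return HighlightMeta(page, start, end, m["added"])
```

The date is left as a plain string here, since its format may differ with the Kindle's language setting.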

dangbert · 2021-04-04T06:15:47Z

The parsing issues related to this issue are addressed and fixed now after #5 and c53fec7, thank you @AmmarShaqeel for the contribution!

However, this doesn't address the original encoding problem in the OP. I haven't experienced this issue on Ubuntu 20 with this "My Clippings.txt" file, but I can try to replicate it by running this on Windows...

Edit: I'm also experiencing the same "charmap" codec error when running on Windows (Windows 8, Python 3.6.5).
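Some background on why this can differ by platform: when Python's `open()` is called in text mode without an `encoding` argument, it falls back to the locale's preferred encoding, which on Windows is often cp1252 (the source of "charmap" errors) and on Ubuntu is usually UTF-8. A minimal sketch of one way to sidestep that is to read the raw bytes and decode them explicitly. This assumes the clippings file is UTF-8, possibly with a byte order mark; the function names are hypothetical, and this is not necessarily what clippy.py does.

```python
from pathlib import Path


def decode_clippings(raw: bytes, encodings=("utf-8-sig", "cp1252")) -> str:
    """Decode bytes by trying each candidate encoding in order.

    "utf-8-sig" handles UTF-8 with or without a leading BOM.
    "cp1252" is only a guess for files saved by Windows editors.
    """
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeError:
            continue
    # Last resort: keep going, replacing any undecodable bytes.
    return raw.decode("utf-8", errors="replace")


def read_clippings(path: str) -> str:
    """Read a clippings file without depending on the platform's default encoding."""
    return decode_clippings(Path(path).read_bytes())
```

Trying UTF-8 first matters because cp1252 accepts almost any byte sequence and would silently produce garbled text for a UTF-8 file.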

Update: Nov 10, 2022:
The repo needs to be refactored and tested to work on Windows. I will use GitHub Actions on Windows to troubleshoot this (as I don't have a Windows machine)...

If I remember correctly, the issue had something to do with the datetime library...

TODO: add chardet to requirements.txt, test in linux

AmmarShaqeel mentioned this issue Apr 1, 2021

Update parsing to cover 2 alternative formats for 'My highlights' #5

Merged

dangbert added a commit that referenced this issue Apr 4, 2021

Add parsing support for the remaining edge cases surfaced by issue #1

c53fec7

dangbert added a commit that referenced this issue Apr 7, 2021

Fix file decoding in windows (issue #1)

55dc896


dangbert added the "bug" label Sep 12, 2022

Repository owner deleted a comment from DaDaQingFeng Jan 31, 2024
