-
-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: decode media names before inserting into the database #18019
base: main
Are you sure you want to change the base?
fix: decode media names before inserting into the database #18019
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A description of the issue and how this fix it would be appreciated. You also said that you had thought about other ways to fix it. What are they?
@@ -2575,7 +2575,8 @@ class NoteEditor : | |||
|
|||
private fun updateField(field: FieldEditText?): Boolean { | |||
val fieldContent = field!!.text?.toString() ?: "" | |||
val correctedFieldContent = NoteService.convertToHtmlNewline(fieldContent, shouldReplaceNewlines()) | |||
// Decode the file name when preparing data for saveNote() to handle special characters correctly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the //
should follow the surrounding indent
@@ -2575,7 +2575,8 @@ class NoteEditor : | |||
|
|||
private fun updateField(field: FieldEditText?): Boolean { | |||
val fieldContent = field!!.text?.toString() ?: "" | |||
val correctedFieldContent = NoteService.convertToHtmlNewline(fieldContent, shouldReplaceNewlines()) | |||
// Decode the file name when preparing data for saveNote() to handle special characters correctly | |||
val correctedFieldContent = Uri.decode(NoteService.convertToHtmlNewline(fieldContent, shouldReplaceNewlines())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't this affect any kind of encoded content the user has added?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not aware of any instances where users would intentionally add encoded content, but this is something to consider. If necessary, we might need to ensure that only media filenames are decoded rather than applying decoding to all fields.
Do you know of any cases where users might intentionally add encoded content?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From memory, no. But in Anki there are always someone doing stuff like that
I have updated the PR description for better clarity. Regarding the second question, modifying the editorNote parameter anywhere before passing it to addNote() would effectively resolve the issue. Initially, I tried to fix it by updating the fields parameter in NoteEditor.kt at
But, this introduces an extra iteration, which is unnecessary. If needed, we can update editorNote at any suitable point before calling addNote() to optimize the fix. |
This commit ensures that media file names remain unencoded when stored in the database, as they are already encoded during rendering. This prevents unnecessary encoding and ensures correct handling of special characters in filenames.
Hi @BrayanDSO , Would you be able to take some time to review this? Anki-Android/AnkiDroid/src/main/java/com/ichi2/anki/cardviewer/AndroidCardRenderContext.kt Line 54 in 19df388
The method encodes media file names for rendering the HTML. This implies that encoding media file names during card creation should not be necessary. Based on the comments in the backend code as well : the media file names are not encoded when adding a card in the desktop version and the database contains the actual file name contrary to current implementation of android version. I confirmed this by checking the database as mentioned earlier and also by exporting a card in plain text format, which resulted in:
Proposed Fix : Instead of decoding media file names before storing them in the database, a better solution would be to avoid encoding them when adding cards in the first place, since they are already encoded during rendering. I have already made the necessary changes in the commit, and I kindly ask you to review and verify it. This change will cause a slight difference in the UI. Previously, when an image was attached from the gallery, the encoded file name was saved in the text field. With this fix, the actual file name will be displayed in the text field. Additionally,In my opinion if encoding media file names is necessary anywhere in the code base, instead of using Anki-Android/AnkiDroid/src/main/java/com/ichi2/anki/multimediacard/fields/ImageField.kt Line 83 in 19df388
we should use the following function to maintain consistency:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this PR.
Can you please send the commit(s) as you want it/them in the repo. So please, squash them and ensure the description is stating exactly what you are doing and what you do it.
This really needs a regression test. Can you please add one?
And the biggest issue. What will occur to notes that were created with the current version of ankidroi?
Indeed, suppose that:
- you create a card with today buggy version of ankidroid. You add an image with a comma in its name as in the original bug,
- you update ankidroid so that your PR is incorporated,
- you show a card of the note
- you use check media
Is the image still correctly displayed in the reviewer? Is the media still detected as unused in the "check media" ? I don't understand the reviewer enough to know what to expect for the first case. However, I suspect that we'll still offer to delete the media as unused while it's used, because nothing in your code seems to have any impact over the "unused media" feature.
And while it's great that we'll stop introducing buggy entry in the notes, we should ideally be able to deal with all notes that were created in the buggy version of ankidroid, so that we don't break cards of our users.
It'd be also nice that you test a card created with today version of ankidroid on anki desktop. Ensuring that anki desktop does not detect the media as unused. If it does, we'll need to discuss with Damien Elmes what we want to do about it, but ideally, we'd need to update anki so that it does not remove used media that were created by our users when we had a bug
@@ -45,7 +45,7 @@ interface IField : Serializable { | |||
/** | |||
* Returns the formatted value for this field. Each implementation of IField should return in a format which will be | |||
* used to store in the database | |||
* | |||
* - The formatted value now contains the actual file name instead of the encoded name. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the "now". That seems to indicate that during the execution of the program, there is a past time where something else occurred. Not clear what.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Earlier, the formattedValue held the encoded file name within the tag because of its setter behavior in the ImageField.kt file that implements the IField.kt interface. In this commit, it has been updated to contain the actual file name.
I added this comment for this particular commit, but it doesn’t align with the codebase actually, I’ll be removing it in the next commit.
// No need to encode the file name, as it will be encoded during rendering. | ||
// val encodedName = Uri.encode(file.name) | ||
val fileName = file.name | ||
"""<img src="$fileName">""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src="${file.name}"
would be fine too, no need to use two lines here.
@@ -80,8 +79,10 @@ class ImageField : | |||
@NeedsTest("files with HTML illegal chars can be imported and rendered") | |||
fun formatImageFileName(file: File): String = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you please document the function? I fail to understand when we'd want to return the empty string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "else" condition was already present, possibly as a fallback, and I haven't made any changes to that part. Would you like me to take a look at it?
This behavior is consistent across:
This PR does not address the existing issue with the "Check Media" feature. After this PR, the tool will still display the previous buggy entries. To resolve this, we could update the database to store the actual file name instead of the encoded version, or we could explore a better fix..
PS: We are not altering the results from the "Check Media" function; we are simply displaying them as they are. |
@@ -80,8 +79,10 @@ class ImageField : | |||
@NeedsTest("files with HTML illegal chars can be imported and rendered") | |||
fun formatImageFileName(file: File): String = | |||
if (file.exists()) { | |||
val encodedName = Uri.encode(file.name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you use git blame and check the commit that introduced this, it tried to fix #15844. Since you are reverting the change, please check if this has created a regression
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@BrayanDSO , I checked, and it seems that the current PR reintroduces the issue. Reverting the change has caused the regression. I'll look into this further.
@Arthur-Milchior @BrayanDSO The current implementation uses Uri.encode() here: Anki-Android/AnkiDroid/src/main/java/com/ichi2/anki/multimediacard/fields/ImageField.kt Line 83 in 90e086f
This resolves the issue of rendering media files with encoded characters in their names. For example, when uploading a file named test%20.png, it gets stored as test%2520.png. As a result, However, this approach causes the Check Media functionality to fail because the database now contains the encoded filename, while the actual file on disk remains unencoded. Anki-Android/AnkiDroid/src/main/java/com/ichi2/anki/cardviewer/AndroidCardRenderContext.kt Line 54 in 90e086f
does not encode media names that already contain encoded characters as expected. For instance, if the file name is test%20.png, it remains unchanged as test%20.png, resulting in <img src='test%20.png'/> . This causes the system to look for test .png (without %20), which does not exist, leading to rendering issues in both the Android and desktop versions.
I’ve raised this issue with Dae (comment here) to determine whether this should be fixed upstream, as both the desktop and Android versions are affected. Proposed Fix (If No Upstream Changes) This would work as follows:
I’ve tested this approach, and it appears to work as expected. That said, I’ll wait for Dae’s response before proceeding with this PR. Let me know if you have any suggestions. |
Hello 👋, this PR has had no activity for more than 2 weeks and needs a reply from the author. If you think this is a mistake please comment and ping a maintainer to get this merged ASAP! Thanks for contributing! You have 7 days until this gets closed automatically |
Issue:
Check Media shows incorrect values when media filenames contain special characters (spaces, %, &, =, etc.) on Android, while the desktop version works as expected.
Cause:
According to the Anki manual (https://docs.ankiweb.net/media.html#media), filenames with spaces or special characters appear differently in the HTML editor compared to how they are stored on disk. For example,
hello 100%.jpg
is displayed ashello%20100%25.jpg
in the editor.HTML encoding is likely applied because certain characters have special meanings in URLs and HTML syntax. Without proper encoding, the card rendering engine may fail to locate or display media files with special characters in their names.
For instance, if a file named
test,.jpg
is referenced in a card, it gets encoded astest%2C.jpg
in the editor for proper rendering, while the actual file on disk remainstest,.jpg
.Observations:
collection.anki2
database (notes
table).Scrrenshot of the notes table :
This discrepancy causes the
check_media
function to return incorrect results for media files added from Android when their names include special characters.Approach
Now, while adding card in a deck
addNote()
function is calledhttps://github.com/Sahil06012002/Anki-Android/blob/fb3eec394463e19f0b3db3fdaa5b6b1eb156879a/AnkiDroid/src/main/java/com/ichi2/anki/NoteEditor.kt#L1198
that stores the note in the database.
Decoding the media filenames before storing them in the database resolves the issue.
Before calling
addNote(editorNote!!, deckId)
, thefields
parameter ineditorNote
is updated to replace encoded filenames with their decoded versions :https://github.com/Sahil06012002/Anki-Android/blob/fb3eec394463e19f0b3db3fdaa5b6b1eb156879a/AnkiDroid/src/main/java/com/ichi2/anki/NoteEditor.kt#L2579 .
Fixes
How Has This Been Tested?
https://github.com/user-attachments/assets/721fba7e-07d0-4523-a851-3f8bfc7f6910
Checklist