`.replace_text` does not work as intended. #223

jymchng · 2023-03-24T17:30:51Z

L105    pdf.replace_text(2, "jdoe123@mycompany.net", "hello WORLD").unwrap();
L106    dbg!(pdf.extract_text(&[2]).unwrap());

Logs

[src\redact.rs:106] pdf.extract_text(&[2]).unwrap() = "For example, john.doe@example.com, jdoe123@mycompany.net, \nalice_123+test@gmail.co.uk, and jane\n-\ndoe@my\n-\nuniversity.edu all match this pattern, \nand are therefore considered valid email addresses.\n \n \n"

Apparently, directly replacing text in a page doesn't work?


jymchng · 2023-03-27T04:11:08Z

@J-F-Liu Hi J-F-Liu, I have been thinking about the `replace_text` method, which returns a `Result<()>`. That return type implies a contract between the caller and the callee: if `replace_text` actually replaces the text in the PDF, it returns `Ok(())`; otherwise it returns an `Err` variant.

For this function, particularly on Line 138, it seems that nothing happens when the operator is not one of the two that are handled ("Tf" or "Tj"): the match arm `_ => {}` is an empty block. Would it be better to return an `Err`, so that the caller knows the function could not do what it promises because it cannot handle the other operators?
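To make the suggestion above concrete, here is a minimal sketch of a replace function that reports when it changed nothing. It is not lopdf code: `Operation`, `ReplaceError`, and `replace_text_strict` are hypothetical names, and the content stream is modelled as plain strings rather than real PDF objects.

```rust
use std::fmt;

/// Hypothetical, simplified stand-in for one content-stream operation.
#[derive(Debug, Clone, PartialEq)]
pub struct Operation {
    pub operator: String,
    pub operands: Vec<String>,
}

/// Hypothetical error type describing why nothing was replaced.
#[derive(Debug, PartialEq)]
pub enum ReplaceError {
    /// No operand of a handled operator was equal to the search string.
    NoMatch,
}

impl fmt::Display for ReplaceError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match *self {
            ReplaceError::NoMatch => write!(f, "no matching text found to replace"),
        }
    }
}

impl std::error::Error for ReplaceError {}

/// Replaces every `Tj` operand equal to `from` with `to`.
/// Returns the number of replacements, or `Err(NoMatch)` if nothing changed,
/// so the caller can tell a successful replacement from a silent no-op.
pub fn replace_text_strict(
    ops: &mut [Operation],
    from: &str,
    to: &str,
) -> Result<usize, ReplaceError> {
    let mut replaced = 0;
    for op in ops.iter_mut() {
        match op.operator.as_str() {
            "Tj" => {
                for operand in op.operands.iter_mut() {
                    if operand.as_str() == from {
                        *operand = to.to_string();
                        replaced += 1;
                    }
                }
            }
            // Other operators are still skipped here; the check after the
            // loop is what surfaces "nothing was replaced" to the caller.
            _ => {}
        }
    }
    if replaced == 0 {
        Err(ReplaceError::NoMatch)
    } else {
        Ok(replaced)
    }
}
```

Returning an error only when no replacement happened, rather than on the first unhandled operator, is a design choice of this sketch: a page normally contains many operators unrelated to text, so failing on each of them would make the function unusable.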

jymchng · 2023-03-27T04:12:49Z

#217

J-F-Liu · 2023-03-27T12:06:14Z

Yes, text processing is not implemented completely.


jymchng · 2024-03-31T10:54:20Z

@J-F-Liu Hi Liu, do you think this issue can be fixed?

Heinenen · 2024-08-09T23:03:55Z

Theoretically this can be fixed, but sadly, extracting text from a PDF is hard.
The solution for #125 may lay a first foundation for solving this issue, as it will make it possible to (sometimes) extract the text from the PDF.
However, this is only half of the solution: we would still need to implement putting the replacement text back into the PDF.
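The two halves described above (decoding the text, then encoding the replacement back) can be sketched for the simplest possible case. This is only an illustration under a strong assumption: every byte of a text operand maps to the Unicode code point of the same value, and the whole search string sits inside one operand. Real PDFs use font-specific encodings and often split text across several operands, which is what makes the general problem hard. All function names here are hypothetical and not part of lopdf.

```rust
/// First half: decode the raw bytes of one text operand.
/// Assumption: each byte is the code point of the same value (Latin-1 style).
pub fn decode_latin1(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| b as char).collect()
}

/// Second half: encode text back into bytes under the same assumption.
/// Returns `None` if any character does not fit into a single byte.
pub fn encode_latin1(text: &str) -> Option<Vec<u8>> {
    text.chars()
        .map(|c| if (c as u32) <= 0xFF { Some(c as u8) } else { None })
        .collect()
}

/// Decodes one text run, replaces every occurrence of `from` with `to`,
/// and encodes the result again.
/// Returns `None` when `from` is empty or absent, or when the new text
/// cannot be encoded.
pub fn replace_in_run(bytes: &[u8], from: &str, to: &str) -> Option<Vec<u8>> {
    let decoded = decode_latin1(bytes);
    if from.is_empty() || !decoded.contains(from) {
        return None;
    }
    encode_latin1(&decoded.replace(from, to))
}
```

Unlike an exact comparison of the whole operand against the search string, this sketch matches substrings, so an address embedded in a longer run of text would still be found, as long as the run is not split across operands.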

jymchng referenced this issue Apr 15, 2023

Replace unmaitained encoding crate with encoding_rs (#222)

99cb2a4

Co-authored-by: Lukáš Tyrychtr <ltyrycht@redhat.com>
