fix: overhaul SAX::Parser encoding handling #3288

Merged
flavorjones merged 6 commits into main from 918-sax-parser-encoding
Jul 7, 2024

Conversation

flavorjones
Member

What problem is this PR intended to solve?

Previously, encoding overrides were not implemented for XML::SAX::Parser#parse_memory (as reported in #918) and XML::SAX::Parser#parse_file.

However, this commit goes further and significantly simplifies and unifies the two SAX::ParserContext implementations and the two SAX::Parser implementations.

This commit also allows Encoding objects and encoding names to be passed into the SAX::ParserContext methods, and the XML memory and file methods now accept and properly use passed encodings.

Finally, this commit backfills substantial test coverage for encoding handling in the XML and HTML4 SAX parsers.

Closes #918

Have you included adequate test coverage?

Yes.

Does this change affect the behavior of either the C or the Java implementations?

Yes. Both are affected, and the two implementations now behave more consistently with each other.

"it's" means "it is", "its" means "belonging to it"
We'll use this in an upcoming commit to simplify the sax parsers
and polyfill xmlSwitchEncodingName. We'll need this functionality in
the next commit.
@flavorjones flavorjones force-pushed the 918-sax-parser-encoding branch from 2e99210 to f67b294 Compare July 7, 2024 20:39
@flavorjones flavorjones enabled auto-merge July 7, 2024 21:01
@flavorjones flavorjones merged commit 1ba1db1 into main Jul 7, 2024
131 of 132 checks passed
@flavorjones flavorjones deleted the 918-sax-parser-encoding branch July 7, 2024 21:03
bihorco36 added a commit to puzzle/prawn-markup that referenced this pull request Dec 16, 2024
Due to changes in nokogiri 17, the SAX parser now needs a default
encoding: sparklemotion/nokogiri#3288
bihorco36 added a commit to puzzle/prawn-markup that referenced this pull request Dec 17, 2024
Due to changes in nokogiri 17, the SAX parser now needs a default
encoding: sparklemotion/nokogiri#3288