SelectorParseException is thrown on attempt to get cssSelector for element with class containing *: #2169

Closed
valfirst opened this issue Jul 15, 2024 · 1 comment

Steps to reproduce:

package org.example;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class Main
{
    public static void main(String[] args)
    {
        String html = "<!DOCTYPE html><html><body>\n"
                      + "  <img class=\"vds-flex_1 vds-d_block lg:vds-d_flex vds-flex_column vds-items_flex-end [&amp;_>_*:first-child]:vds-pt_0\" href=\"https://any.host/gif.gif\"></img>\n"
                      + "</body></html>";
        Document document = Jsoup.parse(html);
        Element img = document.body().child(0);

        // Next line will throw org.jsoup.select.Selector$SelectorParseException
        String cssSelector = img.cssSelector();
    }
}

Exception:

org.jsoup.select.Selector$SelectorParseException: Could not parse query 'img.vds-flex_1.vds-d_block.lg\:vds-d_flex.vds-flex_column.vds-items_flex-end.\[\&_\>_*\:first-child\]\:vds-pt_0': unexpected token at '\:first-child\]\:vds-pt_0'
	at org.jsoup.select.QueryParser.consumeEvaluator(QueryParser.java:184)
	at org.jsoup.select.QueryParser.parse(QueryParser.java:75)
	at org.jsoup.select.QueryParser.parse(QueryParser.java:46)
	at org.jsoup.select.QueryParser.combinator(QueryParser.java:91)
	at org.jsoup.select.QueryParser.parse(QueryParser.java:61)
	at org.jsoup.select.QueryParser.parse(QueryParser.java:46)
	at org.jsoup.select.Selector.select(Selector.java:102)
	at org.jsoup.nodes.Element.select(Element.java:475)
	at org.jsoup.nodes.Element.cssSelectorComponent(Element.java:948)
	at org.jsoup.nodes.Element.cssSelector(Element.java:929)
	at org.example.Main.main(Main.java:18)

jhy self-assigned this Jul 16, 2024
jhy added the fixed label Jul 16, 2024
jhy added this to the 1.18.2 milestone Jul 16, 2024
jhy closed this as completed in 69d2e43 Jul 16, 2024

jhy (Owner) commented Jul 16, 2024

Thanks, fixed. We were passing the wrong set of allowed characters into the escape function.

Related to #2146 - we weren't handling escaped characters correctly when escaping. But prior to that we weren't escaping * because it wasn't in the (wrong) list. So this probably worked. And that changed in #1811.
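
For readers following along, here is a minimal sketch of the kind of identifier escaping discussed above. It is not jsoup's implementation or the actual fix: `CssIdentEscape` and `escapeIdent` are hypothetical names, and the logic is a simplified version of the general CSS rule that any ASCII character other than letters, digits, `-` and `_` must be backslash-escaped inside a class selector. It shows that the `*` in the reported class name needs an escape as well, which is the character left bare in the failing query.

```java
public class CssIdentEscape {

    // Hypothetical helper, NOT jsoup's API: escapes a class name so that it can
    // be used after a '.' in a CSS selector. Simplified; it does not cover every
    // edge case of the CSS serialization rules (e.g. a digit after a leading '-').
    public static String escapeIdent(String ident) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < ident.length(); ) {
            int cp = ident.codePointAt(i);
            boolean first = (i == 0);
            i += Character.charCount(cp);

            if (cp == 0) {
                // NULL is replaced by the Unicode replacement character.
                out.append('\uFFFD');
            } else if (cp < 0x20 || cp == 0x7F || (first && cp >= '0' && cp <= '9')) {
                // Control characters and a leading digit use a hex escape
                // terminated by a space.
                out.append('\\').append(Integer.toHexString(cp)).append(' ');
            } else if (isNameChar(cp)) {
                out.appendCodePoint(cp);
            } else {
                // Everything else, including '*', ':', '[', ']', '&' and '>',
                // gets a backslash in front of it.
                out.append('\\').appendCodePoint(cp);
            }
        }
        return out.toString();
    }

    // Characters that may appear unescaped in an identifier.
    private static boolean isNameChar(int cp) {
        return cp >= 0x80 || cp == '-' || cp == '_'
                || (cp >= '0' && cp <= '9')
                || (cp >= 'a' && cp <= 'z')
                || (cp >= 'A' && cp <= 'Z');
    }

    public static void main(String[] args) {
        // The problematic class token from the report, as it appears after
        // HTML parsing (the &amp; entity has been decoded to &).
        String cls = "[&_>_*:first-child]:vds-pt_0";
        System.out.println("img." + escapeIdent(cls));
    }
}
```

Compared with the query in the exception message, the only difference in the output of this sketch is that `*` is emitted as `\*`, so the parser no longer sees a bare universal selector in the middle of the class name.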
