diff --git a/src/main/java/org/jsoup/select/Selector.java b/src/main/java/org/jsoup/select/Selector.java
index c050ff1366..9a5f5ff275 100644
--- a/src/main/java/org/jsoup/select/Selector.java
+++ b/src/main/java/org/jsoup/select/Selector.java
@@ -37,18 +37,18 @@
 * [attr*=valContaining] | elements with an attribute named "attr", and value containing "valContaining" | a[href*=/search/]
 * [attr~=regex] | elements with an attribute named "attr", and value matching the regular expression | img[src~=(?i)\\.(png|jpe?g)]
 * The above may be combined in any order | div.header[title]
- *

Combinators

+ *

Combinators

 * E F | an F element descended from an E element | div a, .logo h1
 * E {@literal >} F | an F direct child of E | ol {@literal >} li
 * E + F | an F element immediately preceded by sibling E | li + li, div.head + div
 * E ~ F | an F element preceded by sibling E | h1 ~ p
 * E, F, G | all matching elements E, F, or G | a[href], div, h3
- *

Pseudo selectors

+ *

Pseudo selectors

 * :lt(n) | elements whose sibling index is less than n | td:lt(3) finds the first 3 cells of each row
 * :gt(n) | elements whose sibling index is greater than n | td:gt(1) finds cells after skipping the first two
 * :eq(n) | elements whose sibling index is equal to n | td:eq(0) finds the first cell of each row
 * :has(selector) | elements that contain at least one element matching the selector | div:has(p) finds divs that contain p elements.
div:has(> a) selects div elements that have at least one direct child a element.
section:has(h1, h2) finds section elements that contain a h1 or a h2 element
- * :is(selector list) | elements that match any of the selectors in the selector list | :is(h1, h2, h3, h4, h5, h6) finds any heading element.
:is(section, article) > :is(h1, h2) finds a h1 or h2 that is a direct child of a section or an article
+ * :is(selector list) | elements that match any of the selectors in the selector list | :is(h1, h2, h3, h4, h5, h6) finds any heading element.
:is(section, article) > :is(h1, h2) finds a h1 or h2 that is a direct child of a section or an article
 * :not(selector) | elements that do not match the selector. See also {@link Elements#not(String)} | div:not(.logo) finds all divs that do not have the "logo" class.

div:not(:has(div)) finds divs that do not contain divs.

 * :contains(text) | elements that contain the specified text. The search is case insensitive. The text may appear in the found element, or any of its descendants. The text is whitespace normalized.

To find content that includes parentheses, escape those with a {@code \}.

p:contains(jsoup) finds p elements containing the text "jsoup".

{@code p:contains(hello \(there\))} finds p elements containing the text "Hello (There)"

 * :containsOwn(text) | elements that directly contain the specified text. The search is case insensitive. The text must appear in the found element, not any of its descendants. | p:containsOwn(jsoup) finds p elements with own text "jsoup".
@@ -63,7 +63,7 @@
 *

Structural pseudo selectors

 * :root | The element that is the root of the document. In HTML, this is the html element | :root
 * :nth-child(an+b)

elements that have an+b-1 siblings before it in the document tree, for any positive integer or zero value of n, and has a parent element. For values of a and b greater than zero, this effectively divides the element's children into groups of a elements (the last group taking the remainder), and selecting the bth element of each group. For example, this allows the selectors to address every other row in a table, and could be used to alternate the color of paragraph text in a cycle of four. The a and b values must be integers (positive, negative, or zero). The index of the first child of an element is 1.

- * In addition to this, :nth-child() can take odd and even as arguments instead. odd has the same signification as 2n+1, and even has the same signification as 2n. | tr:nth-child(2n+1) finds every odd row of a table. :nth-child(10n-1) the 9th, 19th, 29th, etc, element. li:nth-child(5) the 5th li
+ * Additionally, :nth-child() supports odd and even as arguments. odd is the same as 2n+1, and even is the same as 2n. | tr:nth-child(2n+1) finds every odd row of a table. :nth-child(10n-1) the 9th, 19th, 29th, etc, element. li:nth-child(5) the 5th li
 * :nth-last-child(an+b) | elements that have an+b-1 siblings after it in the document tree. Otherwise like :nth-child() | tr:nth-last-child(-n+2) the last two rows of a table
 * :nth-of-type(an+b) | pseudo-class notation represents an element that has an+b-1 siblings with the same expanded element name before it in the document tree, for any zero or positive integer value of n, and has a parent element | img:nth-of-type(2n+1)
 * :nth-last-of-type(an+b) | pseudo-class notation represents an element that has an+b-1 siblings with the same expanded element name after it in the document tree, for any zero or positive integer value of n, and has a parent element | img:nth-last-of-type(2n+1)
@@ -79,7 +79,9 @@
 *

A word on using regular expressions in these selectors: depending on the content of the regex, you will need to quote the pattern using Pattern.quote("regex") for it to parse correctly through both the selector parser and the regex parser. E.g. String query = "div:matches(" + Pattern.quote(regex) + ")";
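
As a rough illustration of the quoting advice above, the sketch below uses only `java.util.regex.Pattern` to show what `Pattern.quote` produces and how the quoted text behaves as a regex. The class name `QuoteSketch` and the helper `matchesQuery` are hypothetical names made up for this example; they are not part of jsoup, and the sketch does not call the jsoup selector parser itself.

```java
import java.util.regex.Pattern;

class QuoteSketch {
    // Hypothetical helper: builds a :matches(...) query string around a quoted pattern.
    static String matchesQuery(String text) {
        // Pattern.quote wraps the text in \Q ... \E so the regex engine treats it literally.
        return "div:matches(" + Pattern.quote(text) + ")";
    }

    public static void main(String[] args) {
        System.out.println(matchesQuery("1.5")); // prints div:matches(\Q1.5\E)

        Pattern quoted = Pattern.compile(Pattern.quote("1.5"));
        System.out.println(quoted.matcher("v1.5").find()); // true: literal "1.5" is present
        System.out.println(quoted.matcher("1x5").find());  // false: '.' is no longer a wildcard
    }
}
```

Note that quoting turns the whole argument into a literal, so it suits text that should match exactly rather than patterns whose metacharacters are meant to stay active.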

*

Escaping special characters: to match a tag, ID, or other selector that does not follow the regular CSS syntax, the query must be escaped with the \ character. For example, to match by ID {@code i.d}, use {@code document.select("#i\\.d")}.
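
The escaping rule above could be applied programmatically along these lines. This is a simplified sketch: `EscapeSketch` and `escapeForSelector` are hypothetical names, not jsoup API, and the helper only backslash-escapes ASCII characters other than letters, digits, `-` and `_`. It does not handle identifiers that start with a digit, nor newlines, which need a different escape form in CSS.

```java
class EscapeSketch {
    // Hypothetical helper (not part of jsoup): prefixes ASCII punctuation with a backslash.
    static String escapeForSelector(String identifier) {
        StringBuilder out = new StringBuilder();
        identifier.codePoints().forEach(cp -> {
            // Non-ASCII code points, letters, digits, '-' and '_' are passed through unchanged.
            boolean plain = cp >= 0x80 || Character.isLetterOrDigit(cp) || cp == '-' || cp == '_';
            if (!plain) {
                out.append('\\');
            }
            out.appendCodePoint(cp);
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // Builds the same query text as the Java literal "#i\\.d" from the Javadoc example.
        System.out.println("#" + escapeForSelector("i.d")); // prints #i\.d
    }
}
```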

 *
- * @see Element#select(String)
+ * @see Element#select(String css)
+ * @see Elements#select(String css)
+ * @see Element#selectXpath(String xpath)
 */
 public class Selector {
     // not instantiable
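
The an+b rule described for :nth-child() in the Javadoc above can be sketched in plain Java. This is only an illustration of the arithmetic (a 1-based sibling index matches when it equals a*n+b for some integer n of zero or more); `NthChildSketch` and its methods are hypothetical names and do not reflect how jsoup implements its evaluators.

```java
import java.util.ArrayList;
import java.util.List;

class NthChildSketch {
    // True if the 1-based index equals a*n + b for some integer n >= 0.
    static boolean matches(int a, int b, int index) {
        int diff = index - b;
        if (a == 0) {
            return diff == 0; // plain :nth-child(b)
        }
        // diff must be an exact multiple of a, and the multiplier n must not be negative.
        return diff % a == 0 && diff / a >= 0;
    }

    // Collects the matching 1-based positions among childCount siblings.
    static List<Integer> matchingIndexes(int a, int b, int childCount) {
        List<Integer> result = new ArrayList<>();
        for (int i = 1; i <= childCount; i++) {
            if (matches(a, b, i)) {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(matchingIndexes(2, 1, 6));    // odd, 2n+1: [1, 3, 5]
        System.out.println(matchingIndexes(2, 0, 6));    // even, 2n: [2, 4, 6]
        System.out.println(matchingIndexes(10, -1, 30)); // 10n-1: [9, 19, 29]
        System.out.println(matchingIndexes(0, 5, 6));    // 5: [5]
    }
}
```

For :nth-last-child() the same check applies with the index counted from the last sibling instead of the first.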