How can I run an xpath expression? #165

Closed
fracasula opened this issue Jun 12, 2017 · 6 comments

@fracasula

Hi everyone,

I checked the docs here, but it's not clear to me whether Runtime can evaluate XPath expressions in addition to JavaScript expressions.

Has anyone tried?

@cyrus-and
Owner

I think what you want is DOM.performSearch:

const CDP = require('chrome-remote-interface');

CDP(async (client) => {
    try {
        // extract and enable domains
        const {DOM, Page} = client;
        await DOM.enable();
        await Page.enable();
        // navigate the page
        await Page.navigate({url: 'http://example.com'});
        await Page.loadEventFired();
        // "It is important that client receives DOM events only for the nodes
        // that are known to the client."
        await DOM.getDocument();
        // perform the XPath search query
        const {searchId, resultCount} = await DOM.performSearch({
            query: '/html/body/div/*'
        });
        // fetch and display all the results
        const {nodeIds} = await DOM.getSearchResults({
            searchId,
            fromIndex: 0,
            toIndex: resultCount
        });
        console.log(nodeIds);
    } catch (err) {
        console.error(err);
    } finally {
        client.close();
    }
}).on('error', (err) => {
    console.error(err);
});

Otherwise you can always use Runtime.evaluate to inject a Document.evaluate call.
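
As a rough sketch of that second option (not necessarily how the author would write it), the XPath can be embedded in a document.evaluate call and sent through Runtime.evaluate. The helper names xpathCountExpression and countXPathMatches are made up for illustration; `Runtime` is assumed to be the domain object taken from a connected client, as in `const {Runtime} = client`.

```javascript
// Build a page-side expression that counts the nodes matching an XPath.
// JSON.stringify quotes and escapes the XPath so that it becomes a valid
// JavaScript string literal inside the expression.
function xpathCountExpression(xpath) {
    return 'document.evaluate(' + JSON.stringify(xpath) + ', document, null, ' +
        'XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null).snapshotLength';
}

// Hypothetical helper: `Runtime` comes from a connected client.
async function countXPathMatches(Runtime, xpath) {
    const {result, exceptionDetails} = await Runtime.evaluate({
        expression: xpathCountExpression(xpath),
        // ask for the plain number rather than a remote object handle
        returnByValue: true
    });
    if (exceptionDetails) {
        // e.g. a syntactically invalid XPath throws inside the page
        throw new Error('XPath evaluation failed: ' + exceptionDetails.text);
    }
    return result.value;
}
```

Inside the CDP callback above this could be called as `await countXPathMatches(client.Runtime, '/html/body/div/*')`.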

@kensoh

kensoh commented Jul 3, 2017

Thanks Andrea @cyrus-and for creating the chrome-remote-interface project and sharing it with the community! I saw that the Chromy project was also recently built on top of it.

I initially considered using chrome-remote-interface for my web automation project, but ended up integrating directly with Chrome to avoid a Node.js dependency for non-dev users.

@fracasula adding on, I'm using the document.evaluate call Andrea mentioned.
To check whether an element exists, I test whether this is > 0:

document.evaluate(XPATH_SELECTOR,document,null,XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,null).snapshotLength

To get the first matching element and work with it, I use this:

document.evaluate(XPATH_SELECTOR,document,null,XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,null).snapshotItem(0)
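
One thing to keep in mind when sending the second expression over the protocol: a DOM node cannot be returned by value, so it is usually easier to reduce it to a string inside the page. Here is a minimal sketch, assuming the text of the first match is what is wanted; firstMatchTextExpression is a made-up helper name.

```javascript
// Build a page-side expression that yields the textContent of the first
// node matching the XPath, or null when nothing matches.
function firstMatchTextExpression(xpath) {
    return '(() => {' +
        'const node = document.evaluate(' + JSON.stringify(xpath) +
        ', document, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null)' +
        '.snapshotItem(0);' +
        // snapshotItem returns null for an out-of-range index
        'return node ? node.textContent : null;' +
        '})()';
}
```

The resulting string can be passed as `expression` to Runtime.evaluate with `returnByValue: true`; the text (or null) then arrives in `result.value`.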

@tsirolnik

tsirolnik commented Mar 28, 2018

I would like to note that WITHOUT calling DOM.getDocument before performing the search, getSearchResults returns an array of zeros.

Code:

        // Removing this line will cause getSearchResults to return an array filled with zeros.
        await DOM.getDocument();
        const { searchId, resultCount } = await DOM.performSearch({ query: xpath });
        if (resultCount < 1) return null;
        const { nodeIds } = await DOM.getSearchResults({
            searchId,
            fromIndex: 0,
            toIndex: resultCount
        });
        console.log(nodeIds);
        // Await each call in turn so that a failure is caught by the enclosing function.
        for (const id of nodeIds) {
            // This will fail without DOM.getDocument
            console.log(await DOM.getAttributes({ nodeId: id }));
        }
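
DOM.getAttributes returns the attributes as one interleaved array of names and values (name, value, name, value, ...), which is awkward to read in a log. A small sketch of turning that into one object per node follows; attributesToObject and attributesForNodes are hypothetical helper names, and `DOM` is assumed to be the domain object from a connected client.

```javascript
// Convert the interleaved [name, value, name, value, ...] array into an object.
function attributesToObject(flat) {
    const out = {};
    for (let i = 0; i + 1 < flat.length; i += 2) {
        out[flat[i]] = flat[i + 1];
    }
    return out;
}

// Fetch attributes for each node id sequentially, so that a rejected call
// propagates to the caller instead of being lost.
async function attributesForNodes(DOM, nodeIds) {
    const all = [];
    for (const nodeId of nodeIds) {
        const {attributes} = await DOM.getAttributes({nodeId});
        all.push(attributesToObject(attributes));
    }
    return all;
}
```

For example, `attributesToObject(['id', 'main', 'class', 'a b'])` gives `{id: 'main', class: 'a b'}`.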

Output without getDocument: [image]

Output with getDocument: [image]

@motyar

motyar commented Feb 15, 2019

@tsirolnik How can we get the innerHTML or the text?

@cyrus-and
Owner

@motyar DOM.getOuterHTML may be a possibility.
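
A minimal sketch of how that might look when combined with the XPath search from earlier in the thread. outerHTMLForXPath is a made-up helper name, and `DOM` is assumed to be the domain object from a connected client with the DOM domain already enabled.

```javascript
// Hypothetical helper: returns the outer HTML of every node matching the XPath.
async function outerHTMLForXPath(DOM, xpath) {
    // Request the document first; without this the search yields node ids of 0.
    await DOM.getDocument();
    const {searchId, resultCount} = await DOM.performSearch({query: xpath});
    if (resultCount < 1) {
        return [];
    }
    const {nodeIds} = await DOM.getSearchResults({
        searchId,
        fromIndex: 0,
        toIndex: resultCount
    });
    const html = [];
    for (const nodeId of nodeIds) {
        const {outerHTML} = await DOM.getOuterHTML({nodeId});
        html.push(outerHTML);
    }
    return html;
}
```

Note that this gives the outer HTML, including the element's own tag; if only the inner markup or the text is needed, evaluating `innerHTML` or `textContent` in the page through Runtime.evaluate may be simpler.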

@motyar

motyar commented Feb 15, 2019

I am trying to change this to fetch by XPath; the current code works with CSS selectors.
https://github.com/t9tio/cloudquery/blob/master/app.js#L132
