How to trigger a website's favicon.ico request in chrome Headless mode #680

Closed
Fly-Playgroud opened this issue Aug 10, 2022 · 11 comments
Labels
enhance (New feature or request), question (Questions related to rod)

Comments

@Fly-Playgroud
Contributor

Fly-Playgroud commented Aug 10, 2022

Rod Version: v0.108.2
In Chrome's headless mode, the browser never requests the website's favicon.ico, so the favicon is missing.

Fly-Playgroud added the question (Questions related to rod) label Aug 10, 2022
@ysmood
Member

ysmood commented Aug 10, 2022

How about using Go's http library to simulate a favicon request after the page navigation?
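A minimal sketch of this suggestion, using only the Go standard library. It assumes the icon lives at the conventional /favicon.ico path and ignores any link tag the page may declare. The names faviconURL and fetchFavicon and the example.com URL are hypothetical, chosen for illustration only.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"time"
)

// faviconURL resolves the conventional /favicon.ico location against the
// page URL. It ignores any <link rel="icon"> tag the page may declare.
func faviconURL(pageURL string) (string, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return "", err
	}
	if base.Scheme == "" || base.Host == "" {
		return "", fmt.Errorf("page URL %q is not absolute", pageURL)
	}
	ref := &url.URL{Path: "/favicon.ico"}
	return base.ResolveReference(ref).String(), nil
}

// fetchFavicon downloads the favicon bytes for the given page URL.
func fetchFavicon(client *http.Client, pageURL string) ([]byte, error) {
	u, err := faviconURL(pageURL)
	if err != nil {
		return nil, err
	}
	resp, err := client.Get(u)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode >= 300 {
		return nil, fmt.Errorf("GET %s: unexpected status %s", u, resp.Status)
	}
	return io.ReadAll(resp.Body)
}

func main() {
	// Hypothetical page URL; replace it with the page that was navigated to.
	client := &http.Client{Timeout: 10 * time.Second}
	icon, err := fetchFavicon(client, "https://example.com/some/page")
	if err != nil {
		fmt.Fprintln(os.Stderr, "favicon:", err)
		return
	}
	fmt.Printf("fetched %d bytes\n", len(icon))
}
```

Note that a request made this way comes from the Go process, not from the browser, so it does not share the page's cookies or session and will not show up in the browser's own network traffic.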

@Fly-Playgroud
Contributor Author

Fly-Playgroud commented Aug 10, 2022

How about using Go's http library to simulate a favicon request after the page navigation?

@ysmood
I had thought of that solution before, but in my opinion it is not elegant enough. Since we are already using headless Chrome, it would be better to capture the favicon.ico traffic directly rather than request the favicon.ico URL a second time.

@Fly-Playgroud
Contributor Author

How about using Go's http library to simulate a favicon request after the page navigation?

@ysmood
I now have a more elegant solution, but it needs a way to tell whether Chrome is running in headless mode. Is there an API for that?

@ysmood
Member

ysmood commented Aug 14, 2022

package main

import (
	"fmt"
	"strings"

	"github.com/go-rod/rod"
	"github.com/go-rod/rod/lib/proto"
	"github.com/go-rod/rod/lib/utils"
)

func main() {
	browser := rod.New().MustConnect()

	res, err := proto.BrowserGetBrowserCommandLine{}.Call(browser)
	utils.E(err)

	for _, v := range res.Arguments {
		if strings.Contains(v, "headless") {
			fmt.Println("headless mode")
			return
		}
	}
	fmt.Println("headful mode")
}
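One caveat with the substring check above: strings.Contains(v, "headless") also matches unrelated arguments, for example a profile path that happens to contain the word. A stricter check could look like the sketch below. It assumes the switch appears as --headless or --headless=<value> in the argument list, and hasHeadlessFlag is a hypothetical helper name, not part of rod.

```go
package main

import (
	"fmt"
	"strings"
)

// hasHeadlessFlag reports whether the command-line arguments contain the
// --headless switch, either bare or with a value such as --headless=new.
func hasHeadlessFlag(args []string) bool {
	for _, arg := range args {
		if arg == "--headless" || strings.HasPrefix(arg, "--headless=") {
			return true
		}
	}
	return false
}

func main() {
	// A path containing "headless" must not count as the switch itself.
	args := []string{"/usr/bin/chrome", "--user-data-dir=/tmp/headless-profile"}
	fmt.Println(hasHeadlessFlag(args))                       // false
	fmt.Println(hasHeadlessFlag(append(args, "--headless"))) // true
}
```

The res.Arguments slice returned by the call above could be passed to this helper in place of the inline loop.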

@kurimi1

kurimi1 commented Aug 17, 2022

Hi, could you share the elegant solution? 🙂

@Fly-Playgroud
Contributor Author

Hi, could you share the elegant solution? 🙂

Once my testing is stable, I'll open a PR proposing it as an API for rod.

@kurimi1

kurimi1 commented Aug 17, 2022

Hi, could you share the elegant solution? 🙂

Once my testing is stable, I'll open a PR proposing it as an API for rod.

Thanks, bro. I appreciate it.

@kurimi1

kurimi1 commented Aug 18, 2022

Here are my two attempts, but neither works!

errRod = rod.Try(func() {
	// Attempt 1: read the icon from the page's resources.
	ico := s.Page.Timeout(time.Second * 3).MustElement("link[rel=\"shortcut icon\"]")
	logrus.Info("ico html:", ico.MustHTML())
	logrus.Info("ico url:", ico.MustProperty("href").String())
	// This fails: the icon is not among the page's resources.
	bin, err := s.Page.GetResource(ico.MustProperty("href").String())
	if err != nil {
		logrus.Error("get ico error:", err)
	}
	scraped.Favicon = string(bin)

	// Attempt 2: read the manifest icons. This does not work either.
	res, _ := proto.PageGetManifestIcons{}.Call(s.Page)
	scraped.Favicon = string(res.PrimaryIcon)
})

@Fly-Playgroud
Contributor Author

// IsHeadless reports whether the browser was launched in headless mode,
// judged by its command-line arguments. It panics if the CDP call fails.
func (b *Browser) IsHeadless() bool {
	res, err := proto.BrowserGetBrowserCommandLine{}.Call(b)
	utils.E(err)
	for _, v := range res.Arguments {
		if strings.Contains(v, "headless") {
			return true
		}
	}
	return false
}

// TriggerFavicon makes a headless page request its favicon by injecting JS,
// since headless Chrome does not issue that request itself.
func (p *Page) TriggerFavicon() error {
	if !p.browser.IsHeadless() {
		return errors.New("browser is headful")
	}
	wait := p.EachEvent(func(e *proto.PageFrameNavigated) bool {
		return e.Frame.ID == p.FrameID
	})
	// Use the page's icon link if there is one, else fall back to /favicon.ico.
	js := `() => {
    const faviconElement = document.querySelector("link[rel~=icon]");
    const href = (faviconElement && faviconElement.href) || "/favicon.ico";
    const faviconUrl = new URL(href, window.location).toString();
    const xhr = new XMLHttpRequest();
    xhr.open("GET", faviconUrl);
    xhr.addEventListener("readystatechange", function () {
        if (xhr.readyState === 4) {
            if (xhr.status >= 200 && xhr.status < 300) {
                return true;
            }
        }
    });
    xhr.send();
}`
	_, err := p.Evaluate(Eval(js).ByUser())
	if err != nil {
		return err
	}
	wait()
	p.unsetJSCtxID()
	return nil
}

Here's my solution. The idea is to do what Chrome would normally do itself: since headless Chrome does not request the favicon when rendering, we inject this JS to issue the request instead.
@ysmood @kurimi1

Fly-Playgroud added the enhance (New feature or request) label Aug 30, 2022
@Fly-Playgroud
Contributor Author

Fly-Playgroud commented Aug 30, 2022

@ysmood Would it be possible to merge these two APIs (TriggerFavicon() and IsHeadless()) into rod's master branch?

@ysmood
Member

ysmood commented Aug 30, 2022

Sure, but you need to add the JS to the helper.js file and wrap it in a promise, so the caller knows when the icon is actually available to the browser.
