
context deadline exceeded in linux #771

Closed
Intpute opened this issue Mar 18, 2021 · 1 comment

Intpute commented Mar 18, 2021

What versions are you running?

$ github.com/chromedp/chromedp v0.6.8
$ Google Chrome 89.0.4389.90 in google cloud, linux amd64
$ go 1.15

What did you do? Include clear steps.

Hi, I am getting a "context deadline exceeded" error on Linux. I searched online without luck, and I can't tell whether it is an OS problem or a problem with how I set up my contexts.
My code is below; it just fetches page content from a few URLs. It runs fine on Windows, where I have used it for some time. Now I want to run the executable on a Google Cloud Linux machine, but only the first visit to a link succeeds; after that I get the deadline exceeded error.

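package main

import (
	"context"
	"fmt"
	"net/http"
	"sync"
	"time"

	"github.com/chromedp/chromedp"
)

var wg sync.WaitGroup

// linkX stands in for the URL fetched with plain net/http; the real value was not given.
var linkX = "linkX"

func main() {
	wg.Add(3)

	go func() {
		defer wg.Done()
		web1()
	}()

	go func() {
		defer wg.Done()
		web2()
	}()

	go func() {
		defer wg.Done()
		simpleHttpGet()
	}()

	// The workers loop forever, so this blocks for the life of the process.
	wg.Wait()
}

// web1 polls the first link through chromedp every 50 seconds.
func web1() {
	for {
		resp, err := GetHttpHtmlContent("link1", "xpath", "selector")
		if err == nil { // errors are already printed inside GetHttpHtmlContent
			fmt.Println(resp)
		}
		time.Sleep(50 * time.Second)
	}
}

// web2 polls the second link through chromedp every 50 seconds.
func web2() {
	for {
		resp, err := GetHttpHtmlContent("link2", "xpath", "selector")
		if err == nil {
			fmt.Println(resp)
		}
		time.Sleep(50 * time.Second)
	}
}

// simpleHttpGet polls linkX with plain net/http, without a browser.
func simpleHttpGet() {
	for {
		resp, err := http.Get(linkX)
		if err != nil {
			fmt.Println(err)
		} else {
			fmt.Println(resp)
			resp.Body.Close() // release the connection before the next round
		}
		time.Sleep(50 * time.Second)
	}
}

// GetHttpHtmlContent opens url in a new chromedp context, waits for selector to
// become visible, and returns the outer HTML of the node matched by sel.
// The whole run is limited to 30 seconds.
func GetHttpHtmlContent(url string, selector string, sel interface{}) (string, error) {
	chromeCtx, chromeCancel := chromedp.NewContext(context.Background())
	defer chromeCancel()

	timeoutCtx, timeoutCancel := context.WithTimeout(chromeCtx, 30*time.Second)
	defer timeoutCancel()

	var htmlContent string

	err := chromedp.Run(timeoutCtx,
		chromedp.Navigate(url),
		chromedp.WaitVisible(selector),
		chromedp.OuterHTML(sel, &htmlContent, chromedp.BySearch),
	)
	if err != nil {
		fmt.Println(err)
		return "", err
	}
	return htmlContent, nil
}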

Some additional scenarios:

On Linux:
With a single URL and a single GetHttpHtmlContent function running in a goroutine, it works as expected.
With more than one, for example when I copy GetHttpHtmlContent into a new function to use for different links, the errors appear.
Multiple simpleHttpGet functions in goroutines, which use only http.Get, work as expected with no issue.

On Windows:
All of the above work.

What did you expect to see?

The page contents returned as the result.

What did you see instead?

Only the first visit to each link succeeds; after that it hangs forever with "context deadline exceeded".

ZekeLu (Member) commented Mar 19, 2021

context deadline exceeded just means that chromedp.Run could not finish its job within the specified duration (30s, as set by context.WithTimeout(chromeCtx, 30*time.Second)).
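
To make the meaning of the error concrete, here is a minimal standard-library sketch. It does not use chromedp; `runWithBudget` and `timedOut` are hypothetical helpers, and a sleep stands in for the browser work. It shows that the error is simply what a context returns once its deadline passes before the work is done.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// runWithBudget simulates a job that needs `work` to finish while the caller
// only allows `budget`, similar to running under a context.WithTimeout.
func runWithBudget(budget, work time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), budget)
	defer cancel()

	select {
	case <-time.After(work): // the job finished in time
		return nil
	case <-ctx.Done(): // the budget ran out first
		return ctx.Err()
	}
}

// timedOut reports whether err is (or wraps) context.DeadlineExceeded.
func timedOut(err error) bool {
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println(runWithBudget(200*time.Millisecond, 10*time.Millisecond)) // <nil>

	err := runWithBudget(10*time.Millisecond, 200*time.Millisecond)
	fmt.Println(err, timedOut(err)) // context deadline exceeded true
}
```

Checking with `errors.Is` rather than comparing strings also works when the error has been wrapped by another layer.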

chromedp.Navigate(url) won't finish until all the resources on the page are loaded or the timeout expires. Maybe the page is very slow to access from Google Cloud?

chromedp.WaitVisible(selector) won't finish until the specified selector is found and visible on the page. Maybe the website blocks IPs owned by Google Cloud and serves you a different page that doesn't contain the expected element (for example, #758).
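
One rough way to test the "different page" theory from the cloud machine is a plain HTTP GET that checks the status code and looks for a marker string in the body. The sketch below is only an assumption about how one might do that: `pageContains` and `newFakeSite` are hypothetical helpers, the marker `id="content"` is made up, and a local test server stands in for the real site so the example runs offline. A plain GET does not execute JavaScript, so a missing marker can also mean the element is rendered client-side.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// pageContains fetches url with a plain HTTP GET and reports the status code
// and whether the raw response body contains marker.
func pageContains(url, marker string) (int, bool, error) {
	client := &http.Client{Timeout: 15 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return 0, false, err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body) // Go 1.16+; ioutil.ReadAll on older toolchains
	if err != nil {
		return resp.StatusCode, false, err
	}
	return resp.StatusCode, strings.Contains(string(body), marker), nil
}

// newFakeSite is a local stand-in for the real site that always answers with
// the given status and body.
func newFakeSite(status int, body string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(status)
		fmt.Fprint(w, body)
	}))
}

func main() {
	normal := newFakeSite(http.StatusOK, `<div id="content">hello</div>`)
	defer normal.Close()
	blocked := newFakeSite(http.StatusForbidden, `<h1>Access denied</h1>`)
	defer blocked.Close()

	for _, url := range []string{normal.URL, blocked.URL} {
		status, found, err := pageContains(url, `id="content"`)
		fmt.Println(status, found, err)
	}
	// prints:
	// 200 true <nil>
	// 403 false <nil>
}
```

Running the same check from a Windows desktop and from the cloud host, with the real URL and a string known to appear on the normal page, would show whether the two machines receive the same response.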

You can create the context with logging enabled to check what it's waiting for: chromeCtx, chromeCancel := chromedp.NewContext(context.Background(), chromedp.WithDebugf(log.Printf))
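
Besides debug logging, another way to see which step is waiting is to give each step its own, shorter deadline and label the error with the step's name. The sketch below illustrates the idea with the standard library only; `step`, `runSteps`, `simulate`, and `demo` are hypothetical names, and sleeps stand in for the navigate and wait-visible actions. It is not chromedp code.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// step is a named unit of work with its own time budget.
type step struct {
	name   string
	budget time.Duration
	do     func(ctx context.Context) error
}

// runSteps runs the steps in order, giving each its own deadline, and labels
// the first failure with the name of the step that produced it.
func runSteps(parent context.Context, steps ...step) error {
	for _, s := range steps {
		ctx, cancel := context.WithTimeout(parent, s.budget)
		err := s.do(ctx)
		cancel()
		if err != nil {
			return fmt.Errorf("step %q: %w", s.name, err)
		}
	}
	return nil
}

// simulate returns a stand-in for a browser action that takes d to finish.
func simulate(d time.Duration) func(context.Context) error {
	return func(ctx context.Context) error {
		select {
		case <-time.After(d):
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

// demo runs a fast "navigate" followed by a "wait visible" that cannot finish
// within its budget.
func demo() error {
	return runSteps(context.Background(),
		step{"navigate", 200 * time.Millisecond, simulate(10 * time.Millisecond)},
		step{"wait visible", 50 * time.Millisecond, simulate(time.Second)},
	)
}

func main() {
	fmt.Println(demo()) // step "wait visible": context deadline exceeded
}
```

Because the error is wrapped with `%w`, callers can still detect the timeout with `errors.Is(err, context.DeadlineExceeded)` while the message says which step ran out of time.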

Intpute closed this as completed Mar 20, 2021