context deadline exceeded in linux #771

Intpute · 2021-03-18T15:53:28Z

What versions are you running?

$ github.com/chromedp/chromedp v0.6.8
$ Google Chrome 89.0.4389.90 in google cloud, linux amd64
$ go 1.15

What did you do? Include clear steps.

Hi, I am getting a "context deadline exceeded" error on Linux. I searched online without luck, and I can't tell whether it is an OS problem or a problem with how I set up the context.
My code is below; it just fetches page content from some URLs. It runs fine on Windows, where I have used it for some time. Now I want to run the executable on a Google Cloud Linux machine, but only the first visit to a link succeeds; after that I get the deadline exceeded error.

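package main

import (
	"context"
	"fmt"
	"net/http"
	"sync"
	"time"

	"github.com/chromedp/chromedp"
)

var wg sync.WaitGroup

func main() {
	wg.Add(3)

	go func() {
		web1()
		wg.Done()
	}()

	go func() {
		web2()
		wg.Done()
	}()

	go func() {
		simpleHttpGet()
		wg.Done()
	}()

	wg.Wait()
}

// web1 polls the first link every 50 seconds.
func web1() {
	for {
		resp, err := GetHttpHtmlContent("link1", "xpath", "selector")
		if err == nil { // errors are already printed inside GetHttpHtmlContent
			fmt.Println(resp)
		}
		time.Sleep(50 * time.Second)
	}
}

// web2 polls the second link every 50 seconds.
func web2() {
	for {
		resp, err := GetHttpHtmlContent("link2", "xpath", "selector")
		if err == nil {
			fmt.Println(resp)
		}
		time.Sleep(50 * time.Second)
	}
}

// simpleHttpGet polls a third link with plain net/http (linkX is a placeholder).
func simpleHttpGet() {
	for {
		resp, err := http.Get(linkX)
		if err != nil {
			fmt.Println(err)
		} else {
			fmt.Println(resp)
			resp.Body.Close() // release the connection
		}
		time.Sleep(50 * time.Second)
	}
}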

func GetHttpHtmlContent(url string, selector string, sel interface{}) (string, error) {
	chromeCtx, chromeCancel := chromedp.NewContext(context.Background())
	defer chromeCancel()

	timeoutCtx, timeoutCancel := context.WithTimeout(chromeCtx, 30*time.Second)
	defer timeoutCancel()

	var htmlContent string

	err := chromedp.Run(timeoutCtx,
		chromedp.Navigate(url),
		chromedp.WaitVisible(selector),
		chromedp.OuterHTML(sel, &htmlContent, chromedp.BySearch),
	)
	if err != nil {
		fmt.Println(err)
		return "", err
	}
	return htmlContent, nil
}

I also tested these scenarios:

On Linux:
With a single URL and a single GetHttpHtmlContent function in a goroutine, it works as expected.
With more than one, for example when I copy the GetHttpHtmlContent function into a new function and use it for different links, the errors appear.
Multiple simpleHttpGet functions in goroutines, which use only http.Get, work as expected.

On Windows:
All of the above work.

What did you expect to see?

The page contents returned as the result.

What did you see instead?

Only the first visit to the links succeeds; after that it hangs forever with "context deadline exceeded".

ZekeLu · 2021-03-19T01:27:41Z

context deadline exceeded just means that chromedp.Run could not finish its job within the specified duration (30s, as set by context.WithTimeout(chromeCtx, 30*time.Second)).
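To see where that error text comes from, here is a minimal sketch that uses only the standard library. slowStep, runWithTimeout, isDeadline and demo are hypothetical names made up for this illustration; slowStep merely stands in for a blocking action such as chromedp.WaitVisible and does not talk to Chrome at all.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// slowStep is a hypothetical stand-in for a blocking browser action.
// It returns nil after d, or ctx.Err() if the context ends first.
func slowStep(ctx context.Context, d time.Duration) error {
	select {
	case <-time.After(d):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// runWithTimeout runs slowStep under a deadline, mirroring the
// context.WithTimeout(chromeCtx, 30*time.Second) pattern from the issue.
func runWithTimeout(timeout, work time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return slowStep(ctx, work)
}

// isDeadline reports whether err is (or wraps) context.DeadlineExceeded.
func isDeadline(err error) bool {
	return errors.Is(err, context.DeadlineExceeded)
}

// demo runs one step that is too slow for its deadline and one that fits.
func demo() (slowErr, fastErr error) {
	slowErr = runWithTimeout(50*time.Millisecond, 500*time.Millisecond)
	fastErr = runWithTimeout(500*time.Millisecond, 10*time.Millisecond)
	return slowErr, fastErr
}

func main() {
	slowErr, fastErr := demo()
	fmt.Println("slow step:", slowErr, "| deadline exceeded:", isDeadline(slowErr))
	fmt.Println("fast step:", fastErr)
}
```

The error says only that the deadline passed before the work finished; it does not say which step was still running. Checking with errors.Is lets a caller tell a timeout apart from other failures.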

chromedp.Navigate(url) won't finish until all the resources on the page are loaded or the context times out. Maybe it's very slow to access the page from Google Cloud?

chromedp.WaitVisible(selector) won't finish until the specified selector is found and visible on the page. Maybe the website blocks IPs owned by Google Cloud and serves a different page that doesn't contain the expected element (for example, #758).
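One rough way to check both guesses from the cloud machine, independent of chromedp, is to fetch the page with a plain HTTP GET and look at the latency, the status code, and whether some expected text is in the raw HTML. The sketch below is only an assumption about how one might do that: probe, probeResult and newDemoServer are hypothetical names, and the demo runs against a local test server so it works without network access. A plain GET does not execute JavaScript, so a missing marker is only a hint, not proof that the site is blocking you.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// probeResult summarizes one plain HTTP fetch of a page.
type probeResult struct {
	Status  int           // HTTP status code
	Elapsed time.Duration // time to fetch and read the whole body
	Found   bool          // whether marker appears in the raw HTML
}

// probe fetches url with a plain GET and reports status, latency,
// and whether marker occurs in the response body.
func probe(client *http.Client, url, marker string) (probeResult, error) {
	start := time.Now()
	resp, err := client.Get(url)
	if err != nil {
		return probeResult{}, err
	}
	defer resp.Body.Close()

	// io.ReadAll needs Go 1.16+; older toolchains can use ioutil.ReadAll.
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return probeResult{}, err
	}
	return probeResult{
		Status:  resp.StatusCode,
		Elapsed: time.Since(start),
		Found:   strings.Contains(string(body), marker),
	}, nil
}

// newDemoServer is a local stand-in for the real site (hypothetical content).
func newDemoServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, `<html><body><div id="content">hello</div></body></html>`)
	}))
}

func main() {
	srv := newDemoServer()
	defer srv.Close()

	client := &http.Client{Timeout: 30 * time.Second}
	// Replace srv.URL and the marker with the real link and expected text.
	res, err := probe(client, srv.URL, `id="content"`)
	if err != nil {
		fmt.Println("fetch failed:", err)
		return
	}
	fmt.Printf("status=%d elapsed=%s markerFound=%t\n", res.Status, res.Elapsed, res.Found)
}
```

If the same URL returns a different status, or the marker is present on Windows but missing on the cloud machine, the site is probably serving a different page there.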

You can create the context with logging enabled to check what it is waiting for:

chromeCtx, chromeCancel := chromedp.NewContext(context.Background(), chromedp.WithDebugf(log.Printf))

Intpute closed this as completed Mar 20, 2021
