Docker support #20

Closed
slotix opened this issue May 6, 2019 · 10 comments

@slotix

slotix commented May 6, 2019

Awesome module!
Do you plan to build a "se-scraper" Docker image?
Thank you.

@NikolaiT
Owner

NikolaiT commented May 8, 2019

I don't have much experience with Docker, but I will look into it very soon and add a Docker image.

@slotix
Author

slotix commented May 10, 2019

I'm going to build a Docker image myself and share it with the community.

@snork-alt

I'm getting this error while trying to launch se_scraper in Docker:

UnhandledPromiseRejectionWarning: Error: Unable to launch browser for worker, error message: Failed to launch chrome!
[0514/035629.769555:ERROR:zygote_host_impl_linux.cc(89)] Running as root without --no-sandbox is not supported. See https://crbug.com/638180.

Are you facing the same issue?

@kederrac

kederrac commented May 14, 2019

Use this config or the one from commit:

// searchEngine, keys and num_pages are expected to be defined by the caller.
let config = {
    random_user_agent: true,
    write_meta_data: true,
    sleep_range: "",
    chrome_flags: [],
    search_engine: searchEngine,
    debug: false,
    verbose: false,
    keywords: keys,
    num_pages: num_pages,
    headless: true,
    puppeteer_cluster_config: {
        timeout: 600000,
        monitor: false,
        concurrency: 1,
        maxConcurrency: 1
    }
};

@slotix
Author

slotix commented May 14, 2019

Try adding the "--no-sandbox" flag to "chrome_flags":

curl -XPOST http://0.0.0.0:3000 -H 'Content-Type: application/json' \
-d '{
    "user_agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36",
    "random_user_agent":true,
    "sleep_range":"",
    "search_engine":"baidu",
    "debug":true,
    "verbose":true,
    "keywords":[ "cat",  "mouse" ],
    "keyword_file":"",
    "num_pages":1,
    "headless":true,
    "chrome_flags":["--no-sandbox" ],
    "output_file":"examples/results/baidu.json",
    "block_assets":false,
    "custom_func":"",
    "proxy":"",
    "proxy_file":"",
    "test_evasion":false,
    "apply_evasion_techniques":true,
    "log_ip_address":false,
    "log_http_headers":false,
    "puppeteer_cluster_config":{
        "timeout":600000,
        "monitor":false,
        "concurrency":1,
        "maxConcurrency":1
    }
}'
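
For anyone building the config in Node rather than sending it with curl, here is a minimal sketch of the same idea. It assumes that Chrome only needs "--no-sandbox" when the process runs as root, which is what the error message above complains about. The helper name withNoSandbox is hypothetical and not part of se-scraper; it only touches the "chrome_flags" array and leaves the rest of the config alone.

```javascript
// Hypothetical helper (not part of se-scraper): returns a copy of the config
// with "--no-sandbox" added to chrome_flags when running as root.
function withNoSandbox(config, isRoot) {
  const flags = Array.isArray(config.chrome_flags)
    ? config.chrome_flags.slice()
    : [];
  if (isRoot && !flags.includes('--no-sandbox')) {
    flags.push('--no-sandbox');
  }
  return Object.assign({}, config, { chrome_flags: flags });
}

// process.getuid exists only on POSIX platforms; uid 0 is root,
// which is the default user inside many Docker containers.
const runningAsRoot =
  typeof process.getuid === 'function' && process.getuid() === 0;

const config = withNoSandbox(
  {
    search_engine: 'baidu',
    keywords: ['cat', 'mouse'],
    num_pages: 1,
    headless: true,
    chrome_flags: [],
  },
  runningAsRoot
);

console.log(config.chrome_flags);
```

The original config object is not mutated, so the same base config can be reused outside Docker without the flag.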

@ghost

ghost commented Jul 26, 2019

I pulled the image from Docker Hub. The container's port is closed, both on the localhost IP and on the container's own IP. An nmap scan gives the same result.

@slotix
Author

slotix commented Jul 26, 2019

Pull the latest version from Docker Hub.
Run docker run -it -e HOST=0.0.0.0 -e PORT=3000 -p 3000:3000 slotix/se-scraper
If that doesn't work, try another port instead of 3000.
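
To make switching ports less error-prone, here is a small sketch that assembles the command above for a given port. The helper name dockerRunArgs is hypothetical, and the sketch assumes the same port number is used for the PORT variable and for both sides of the -p mapping, as in the command above.

```javascript
// Hypothetical helper: builds the argument list for the docker run command
// shown above, using one port for PORT and for both sides of -p.
function dockerRunArgs(port, image = 'slotix/se-scraper') {
  return [
    'run', '-it',
    '-e', 'HOST=0.0.0.0',
    '-e', `PORT=${port}`,
    '-p', `${port}:${port}`,
    image,
  ];
}

// Print the full command for port 3000.
console.log(['docker', ...dockerRunArgs(3000)].join(' '));
```

Calling dockerRunArgs(8080) yields the same command with every 3000 replaced by 8080.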

@ghost

ghost commented Jul 26, 2019

Thanks @slotix for the fast help. The new image runs and my requests go through.
If the HOST and PORT environment variables are necessary, you should update the pull request.

Thanks a lot.

@slotix
Author

slotix commented Jul 26, 2019

@axel-g I updated the pull request.

@tobiasmuehl

Can we close this?

slotix closed this as completed Aug 25, 2019