CurlImpersonateHttpClient warning on Windows #495

Closed
janbuchar opened this issue Sep 3, 2024 · 3 comments · Fixed by #538
Labels: bug (Something isn't working), t-tooling (Issues with this label are in the ownership of the tooling team)

Comments

@janbuchar
Collaborator

Using the CurlImpersonateHttpClient adds this warning message upon program start, which doesn't seem to be fixed if I add in the command that it asks for:

    asyncio.set_event_loop_policy(WindowsSelectorEventLoopPolicy())
    await crawler.run(["https://www.mtggoldfish.com/metagame/modern#paper"])

    .venv\Lib\site-packages\curl_cffi\aio.py:137: RuntimeWarning:
        Proactor event loop does not implement add_reader family of methods required.
        Registering an additional selector thread for add_reader support.
        To avoid this warning use:
            asyncio.set_event_loop_policy(WindowsSelectorEventLoopPolicy())

Originally posted by @MrTyton in #486 (comment)

@github-actions bot added the t-tooling label Sep 3, 2024
@janbuchar added the bug label Sep 3, 2024
@janbuchar self-assigned this Nov 4, 2024
@janbuchar
Collaborator Author

@MrTyton Looks like you have to add asyncio.set_event_loop_policy(WindowsSelectorEventLoopPolicy()) before asyncio.run(), i.e., before your event loop is created.
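A minimal sketch of that ordering, for illustration only. The helper name `use_selector_event_loop_on_windows` and the `main` coroutine are hypothetical, and `main` merely stands in for the code that would call `await crawler.run([...])`. The platform guard is an assumption added so the script also runs on non-Windows systems, where `WindowsSelectorEventLoopPolicy` is not available.

```python
import asyncio
import sys


def use_selector_event_loop_on_windows() -> bool:
    """Switch asyncio to the selector event loop policy on Windows.

    Returns True if the policy was changed, False on other platforms.
    (Hypothetical helper, not part of crawlee or curl_cffi.)
    """
    if sys.platform == "win32":
        # This policy class only exists in asyncio on Windows.
        asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
        return True
    return False


async def main() -> str:
    # Stand-in for the real work, e.g. `await crawler.run([...])`.
    # Report which event loop class is actually running.
    loop = asyncio.get_running_loop()
    return type(loop).__name__


if __name__ == "__main__":
    # The policy has to be set BEFORE asyncio.run() creates the event loop;
    # setting it inside an already running coroutine does not replace that loop.
    use_selector_event_loop_on_windows()
    print(asyncio.run(main()))
```

Printing the loop class name is just a quick way to check which loop ended up running. In the snippet from the original report, the policy is set right next to an `await`, i.e. inside a loop that already exists, which would explain why the warning persisted.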

@vdusek
Collaborator

vdusek commented Dec 2, 2024

I believe we should document this somewhere; a follow-up issue is OK.

@janbuchar
Collaborator Author

> I believe we should document this somewhere, follow-up issue is ok

I'm not so sure about that. It's a limitation of a package that we integrate with, and the warning tells you what to do pretty well. Maybe we can add a line about this to the docblock of CurlImpersonateHttpClient.

Mantisus pushed a commit to Mantisus/crawlee-python that referenced this issue Dec 10, 2024
This adds a unified `crawlee/project_template` template. The original
`playwright` and `beautifulsoup` templates are kept for compatibility
with older versions of the CLI.

The user is now prompted for package manager type (pip, poetry), crawler
type, start URL and whether or not Apify integration should be set up.

- closes apify#317
- closes apify#414 (http client selection is not implemented)
- closes apify#511
- closes apify#495

### TODO

- [x] http client selection
- [x] disable poetry option if it isn't installed
- [x] rectify the pip-based setup
  1. **manual dependency installation** - no automatic installation, just dump requirements.txt and tell the user to handle it any way they want
  2. **pip+venv** - dump requirements.txt, make a virtualenv (.venv) using the current python interpreter, install requirements and tell user to activate it
     - ~should be disabled if `venv` module is not present~ it's stdlib
- [x] test the whole thing on Windows (mainly the various package manager configurations)
- [x] fix how cookiecutter.json is read (it is not present when installing via pip)