Documenting the selenium-wire "Wire Mode" REPLACEMENT #3247
Labels: documentation, enhancement (Making things better), UC Mode / CDP Mode (Undetected Chromedriver Mode / CDP Mode)
Comments
mdmintz added the documentation and UC Mode / CDP Mode labels on Nov 6, 2024
mdmintz changed the title from Documenting the selenium-wire "Wire Mode" **replacement** to Documenting the selenium-wire "Wire Mode" REPLACEMENT on Nov 6, 2024
Here's a new example (SeleniumBase/examples/cdp_mode/raw_xhr_sb.py):

"""CDP.network.ResponseReceived with CDP.network.ResourceType.XHR."""
import ast
import asyncio
import colorama
import mycdp
import sys
import time
from seleniumbase.undetected import cdp_driver

xhr_requests = []
last_xhr_request = None
c1 = colorama.Fore.BLUE + colorama.Back.LIGHTYELLOW_EX
c2 = colorama.Fore.BLUE + colorama.Back.LIGHTGREEN_EX
cr = colorama.Style.RESET_ALL
if "linux" in sys.platform:
    c1 = c2 = cr = ""


def listenXHR(page):
    async def handler(evt):
        # Get AJAX requests
        if evt.type_ is mycdp.network.ResourceType.XHR:
            xhr_requests.append([evt.response.url, evt.request_id])
            global last_xhr_request
            last_xhr_request = time.time()
    page.add_handler(mycdp.network.ResponseReceived, handler)


async def receiveXHR(page, requests):
    responses = []
    retries = 0
    max_retries = 5
    # Wait at least 2 seconds after last XHR request for more
    while True:
        if last_xhr_request is None or retries > max_retries:
            break
        if time.time() - last_xhr_request <= 2:
            retries = retries + 1
            time.sleep(2)
            continue
        else:
            break
    await page
    # Loop through gathered requests and get response body
    for request in requests:
        try:
            res = await page.send(mycdp.network.get_response_body(request[1]))
            if res is None:
                continue
            responses.append({
                "url": request[0],
                "body": res[0],
                "is_base64": res[1],
            })
        except Exception as e:
            print("Error getting response:", e)
    return responses


async def crawl():
    driver = await cdp_driver.cdp_util.start_async()
    tab = await driver.get("about:blank")
    listenXHR(tab)
    # Change url to something that makes ajax requests
    tab = await driver.get("https://learn.microsoft.com/en-us/")
    time.sleep(2)
    for i in range(75):
        await tab.scroll_down(3)
        time.sleep(0.02)
    xhr_responses = await receiveXHR(tab, xhr_requests)
    for response in xhr_responses:
        print(c1 + "*** ==> XHR Request URL <== ***" + cr)
        print(f'{response["url"]}')
        is_base64 = response["is_base64"]
        b64_data = "Base64 encoded data"
        try:
            headers = ast.literal_eval(response["body"])["headers"]
            print(c2 + "*** ==> XHR Response Headers <== ***" + cr)
            print(headers if not is_base64 else b64_data)
        except Exception:
            response_body = response["body"]
            print(c2 + "*** ==> XHR Response Body <== ***" + cr)
            print(response_body if not is_base64 else b64_data)


if __name__ == "__main__":
    print("================== Starting ==================")
    loop = asyncio.new_event_loop()
    loop.run_until_complete(crawl())
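
The "wait at least 2 seconds after the last XHR request" step in the example uses time.sleep(), which pauses the whole event loop while it sleeps, so handlers on that loop can't record new XHR events in the meantime. One possible variation, sketched here with only the standard library, is to poll with asyncio.sleep() instead. The names XHRTracker and wait_for_quiet are hypothetical and are not part of SeleniumBase or mycdp; the timings are placeholders.

```python
import asyncio
import time


class XHRTracker:
    """Hypothetical helper: remembers when the latest XHR was seen."""

    def __init__(self):
        self.last_seen = None  # monotonic timestamp, or None if no XHR yet

    def mark(self):
        # A real handler would call this each time an XHR response arrives.
        self.last_seen = time.monotonic()


async def wait_for_quiet(tracker, quiet_seconds=2.0, max_wait=12.0, poll=0.1):
    """Return True once no XHR has been seen for `quiet_seconds`.

    Returns False if `max_wait` seconds pass first. Because this awaits
    asyncio.sleep(), other tasks on the loop keep running while it waits.
    """
    deadline = time.monotonic() + max_wait
    while time.monotonic() < deadline:
        last = tracker.last_seen
        if last is None or time.monotonic() - last >= quiet_seconds:
            return True
        await asyncio.sleep(poll)
    return False


async def demo():
    tracker = XHRTracker()
    tracker.mark()  # pretend an XHR just arrived

    async def late_xhr():
        # Simulates another XHR arriving shortly after the first one.
        await asyncio.sleep(0.05)
        tracker.mark()

    task = asyncio.ensure_future(late_xhr())
    start = time.monotonic()
    quiet = await wait_for_quiet(
        tracker, quiet_seconds=0.2, max_wait=2.0, poll=0.01
    )
    await task
    return quiet, time.monotonic() - start


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In the example above, the handler would call something like tracker.mark() in place of updating the last_xhr_request global, and receiveXHR would await wait_for_quiet(tracker) before fetching response bodies.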
Documenting the selenium-wire "Wire Mode" REPLACEMENT.
As many of you know, there's a selenium-wire integration via SeleniumBase Wire Mode.
There are two main issues with it:
Here's the good news: selenium-wire features are included in the new SeleniumBase CDP Mode (a subset of UC Mode).
Here's an example of that (SeleniumBase/examples/cdp_mode/raw_res_sb.py), where network requests and responses are captured and displayed:
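
As a rough, library-independent illustration of what "capturing requests and responses" involves (this is not the contents of raw_res_sb.py), here is a minimal sketch of the bookkeeping a pair of handlers might do. It assumes each request and its response share a request ID, as CDP's Network.requestWillBeSent and Network.responseReceived events do. The Exchange and NetworkLog names, and their methods, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Exchange:
    """One request plus (eventually) the status of its response."""
    method: str
    url: str
    request_headers: Dict[str, str] = field(default_factory=dict)
    status: Optional[int] = None  # filled in when the response arrives


class NetworkLog:
    """Hypothetical store that pairs requests and responses by ID."""

    def __init__(self) -> None:
        self.exchanges: Dict[str, Exchange] = {}

    def on_request(self, request_id, method, url, headers=None) -> None:
        # Would be called from a "request will be sent" handler.
        self.exchanges[request_id] = Exchange(method, url, dict(headers or {}))

    def on_response(self, request_id, status) -> None:
        # Would be called from a "response received" handler.
        exchange = self.exchanges.get(request_id)
        if exchange is not None:  # ignore responses with no recorded request
            exchange.status = status

    def summary(self) -> List[str]:
        lines = []
        for ex in self.exchanges.values():
            status = ex.status if ex.status is not None else "pending"
            lines.append(f"{ex.method} {ex.url} -> {status}")
        return lines


if __name__ == "__main__":
    log = NetworkLog()
    log.on_request("1", "GET", "https://example.com/", {"Accept": "*/*"})
    log.on_response("1", 200)
    log.on_request("2", "POST", "https://example.com/api")
    for line in log.summary():
        print(line)
```

Keying on the request ID keeps each response attached to the request that produced it, even when several requests are in flight at once.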
Usage is different from regular Selenium-Wire, but it can do all the same things (and more) with better flexibility/control.
Here's another example (SeleniumBase/examples/cdp_mode/raw_req_sb.py), where specific requests are filtered out (intercepted and blocked) to prevent images from loading:
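
The filtering decision itself can be sketched independently of the browser (this is not the contents of raw_req_sb.py). The function below is a hypothetical predicate that an interception handler could consult before letting a paused request continue or failing it. The extension list is an assumption, and the optional resource_type argument assumes the caller passes CDP's resource type string (for example "Image") when it is available.

```python
from urllib.parse import urlsplit

# Assumed set of extensions to treat as images; adjust as needed.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".webp")


def should_block(url, resource_type=None):
    """Return True if the request looks like an image and should be blocked.

    Checks the resource type first when one is given, then falls back to
    the file extension of the URL path (query strings are ignored).
    """
    if resource_type is not None and resource_type.lower() == "image":
        return True
    path = urlsplit(url).path.lower()
    return path.endswith(IMAGE_EXTENSIONS)


if __name__ == "__main__":
    for candidate in (
        "https://example.com/a/photo.PNG?size=large",
        "https://example.com/index.html",
    ):
        action = "BLOCK" if should_block(candidate) else "ALLOW"
        print(action, candidate)
```

Matching on the parsed path rather than on a substring of the whole URL avoids blocking pages that merely contain ".png" or ".jpg" somewhere in a query string or directory name.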
If people don't need the stealth features (or other improvements made to intercepting/handling network requests & responses), then they can continue using the existing Wire Mode as is. Or, if people want the upgrades, then they can use the new CDP Mode (as shown above).