2captcha · poplers24 · Oct 21, 2024
diff --git a/.gitignore b/.gitignore
@@ -129,4 +129,5 @@ dmypy.json
 .pyre/
 
 .idea
+.DS_Store
 proxies_extension.zip
diff --git a/README.md b/README.md
@@ -4,10 +4,19 @@ This project automates solving Google reCAPTCHA v2 with image challenges (3x3 an
 
 ## Features
 
-- Uses **Selenium WebDriver** to interact with the browser and manipulate elements on the reCAPTCHA page.
-- **2Captcha API** helps solve image-based captchas using artificial intelligence.
+- **Selenium WebDriver**: Interacts with the browser and manipulates elements on the reCAPTCHA page.
+- **2Captcha API**: Solves image-based captchas using artificial intelligence.
 - Handles both **3x3** and **4x4** captchas with custom logic for each.
-- Tracks image updates and handles captcha error messages efficiently.
+- Modular design: logic is separated into helper classes for easier maintenance and future expansion.
+- Tracks image updates and handles captcha error messages efficiently using custom error handling.
+
+## Code Structure
+
+The project is structured as follows:
+
+- **`utils/actions.py`**: Contains the `PageActions` class, which encapsulates common browser actions (clicking, switching frames, etc.).
+- **`utils/helpers.py`**: Contains the `CaptchaHelper` class, responsible for solving captchas, executing JS, and handling captcha error messages.
+- **`js_scripts/`**: JavaScript files that extract captcha data and track image updates.
 
 ## Usage
 
@@ -43,16 +52,28 @@ python solve_recaptcha.py
 
 ## How It Works
 
-1. Browser Initialization: A browser is opened using Selenium WebDriver.
-2. Captcha Data Retrieval: JavaScript extracts the image tiles from reCAPTCHA and sends them to the 2Captcha service for solving.
-3. Captcha Submission: Once a solution is received from 2Captcha, Selenium simulates clicking on the correct image tiles based on the solution.
-4. Captcha Submission: The solution is submitted once the captcha is solved.
+1. **Browser Initialization:** A browser is opened using Selenium WebDriver.
+2. **Captcha Data Retrieval:** JavaScript extracts the image tiles from reCAPTCHA and sends them to the 2Captcha service for solving.
+3. **Captcha Submission:** Once a solution is received from 2Captcha, Selenium simulates clicking on the correct image tiles based on the solution.
+4. **Final Submission:** The solution is submitted once the captcha is solved.
 
 ## Captcha Solving Logic
 
-- For 3x3 captchas, the previous captcha ID (previousID) is saved to speed up solving when images are updated.
-- For 4x4 captchas, no previousID is saved, and each solution is processed from scratch.
-- Error messages, such as “Please try again” are handled, and the solving process is retried if needed.
+- **3x3 Captchas:** Previous captcha ID (previousID) is saved to speed up solving when images are updated.
+- **4x4 Captchas:** No previousID is saved, and each solution is processed from scratch.
+- **Error Handling:** Messages like “Please try again” are handled, and the solving process is retried if needed.
+
+## Modular Design
+
+The project follows a modular design for better maintainability:
+
+- **PageActions Class:** Handles general browser interactions like switching to iframes, clicking elements, and returning focus to the main content.
+- **CaptchaHelper Class:** Encapsulates captcha-specific logic, such as solving the captcha via 2Captcha API, handling error messages, and executing JavaScript in the browser.
+
+## JavaScript Scripts
+
+- `get_captcha_data.js`: Extracts captcha image tiles for solving. The original source of the script is available at https://gist.github.com/kratzky/20ea5f4f142cec8f1de748b3f3f84bfc
+- `track_image_updates.js`: Monitors requests to check if captcha images are updated.
 
 <!-- Shared links -->
 [2captcha-demo]: https://2captcha.com/demo

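The "Captcha Solving Logic" section of the README can be illustrated with a small pure function. This is only a sketch: `build_grid_params` and its argument names are hypothetical and not part of the PR. The parameter keys mirror the dictionary built in `main.py`.

```python
from typing import Any, Dict, Optional


def build_grid_params(columns: int, rows: int, comment: str,
                      previous_id: Optional[str] = None) -> Dict[str, Any]:
    """Assemble request parameters for one grid-solving round (illustrative helper)."""
    params: Dict[str, Any] = {
        "method": "base64",
        "img_type": "recaptcha",
        "recaptcha": 1,
        "cols": columns,
        "rows": rows,
        "textinstructions": comment,
        "lang": "en",
        "can_no_answer": 1,
    }
    # Only 3x3 grids reuse the earlier captcha id; 4x4 grids start from scratch.
    if columns == 3 and previous_id:
        params["previousID"] = previous_id
    return params
```

Keeping this logic in one function makes the 3x3 versus 4x4 difference easy to check without opening a browser.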
diff --git a/js_scripts/get_captcha_data.js b/js_scripts/get_captcha_data.js
@@ -0,0 +1,53 @@
+window.getCaptchaData = () => {
+    return new Promise((resolve, reject) => {
+        let canvas = document.createElement('canvas');
+        let ctx = canvas.getContext('2d');
+        let comment = document.querySelector('.rc-imageselect-desc-wrapper').innerText.replace(/\n/g, ' ');
+
+        let img4x4 = document.querySelector('img.rc-image-tile-44');
+        if (!img4x4) {
+            let table3x3 = document.querySelector('table.rc-imageselect-table-33 > tbody');
+            if (!table3x3) {
+                return reject('Cannot find reCAPTCHA elements');
+            }
+
+            let initial3x3img = table3x3.querySelector('img.rc-image-tile-33');
+
+            canvas.width = initial3x3img.naturalWidth;
+            canvas.height = initial3x3img.naturalHeight;
+            ctx.drawImage(initial3x3img, 0, 0);
+
+            let updatedTiles = document.querySelectorAll('img.rc-image-tile-11');
+
+            if (updatedTiles.length > 0) {
+                const pos = [
+                    { x: 0, y: 0 }, { x: ctx.canvas.width / 3, y: 0 }, { x: ctx.canvas.width / 3 * 2, y: 0 },
+                    { x: 0, y: ctx.canvas.height / 3 }, { x: ctx.canvas.width / 3, y: ctx.canvas.height / 3 }, { x: ctx.canvas.width / 3 * 2, y: ctx.canvas.height / 3 },
+                    { x: 0, y: ctx.canvas.height / 3 * 2 }, { x: ctx.canvas.width / 3, y: ctx.canvas.height / 3 * 2 }, { x: ctx.canvas.width / 3 * 2, y: ctx.canvas.height / 3 * 2 }
+                ];
+                updatedTiles.forEach((t) => {
+                    const ind = t.parentElement.parentElement.parentElement.tabIndex - 3;
+                    ctx.drawImage(t, pos[ind - 1].x, pos[ind - 1].y);
+                });
+            }
+            resolve({
+                rows: 3,
+                columns: 3,
+                type: 'GridTask',
+                comment,
+                body: canvas.toDataURL().replace(/^data:image\/?[A-Za-z]*;base64,/, '')
+            });
+        } else {
+            canvas.width = img4x4.naturalWidth;
+            canvas.height = img4x4.naturalHeight;
+            ctx.drawImage(img4x4, 0, 0);
+            resolve({
+                rows: 4,
+                columns: 4,
+                comment,
+                body: canvas.toDataURL().replace(/^data:image\/?[A-Za-z]*;base64,/, ''),
+                type: 'GridTask'
+            });
+        }
+    });
+};
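The tile placement arithmetic in `get_captcha_data.js` may be easier to follow as a Python sketch. The function names below are hypothetical. The only facts taken from the script are the row-major `pos` table and the mapping `ind = tabIndex - 3` followed by `pos[ind - 1]`.

```python
from typing import List, Tuple


def tile_offsets(width: float, height: float,
                 rows: int = 3, cols: int = 3) -> List[Tuple[float, float]]:
    """Top-left (x, y) of each tile in row-major order, like the script's `pos` array."""
    return [(c * width / cols, r * height / rows)
            for r in range(rows)
            for c in range(cols)]


def tile_index_from_tabindex(tab_index: int) -> int:
    """Mirror the script: ind = tabIndex - 3, then the tile is pos[ind - 1]."""
    return (tab_index - 3) - 1


# Example: where a refreshed tile whose cell has tabIndex 8 would be redrawn
# on a 300x300 canvas.
offsets = tile_offsets(300, 300)
x, y = offsets[tile_index_from_tabindex(8)]
```

Under this mapping a cell with `tabIndex` 4 corresponds to the first tile and `tabIndex` 12 to the last one. The script relies on that numbering rather than on a documented reCAPTCHA guarantee.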
diff --git a/js_scripts/track_image_updates.js b/js_scripts/track_image_updates.js
@@ -0,0 +1,22 @@
+window.monitorRequests = () => {
+    let found = false;
+
+    const observer = new PerformanceObserver((list) => {
+        const entries = list.getEntries();
+        entries.forEach((entry) => {
+            if (entry.initiatorType === 'xmlhttprequest' || entry.initiatorType === 'fetch') {
+                const url = new URL(entry.name);
+                if (url.href.includes("recaptcha/api2/replaceimage")) {
+                    found = true;  // If the request is found, set the flag to true
+                }
+            }
+        });
+    });
+
+    observer.observe({ entryTypes: ['resource'] });
+
+    // We return the result after 10 seconds
+    return new Promise((resolve) => {
+        setTimeout(() => resolve(found), 10000);
+    });
+};
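The filtering rule inside `monitorRequests` can be restated in Python for readers who want to test it in isolation. This is a sketch under assumptions: `saw_image_replacement` is a hypothetical name, and the entries are assumed to be plain dictionaries carrying `name` and `initiatorType` keys. How such entries would be collected from the browser is outside the sketch.

```python
from typing import Iterable, Mapping

# Substring the script looks for in request URLs.
REPLACE_IMAGE_MARKER = "recaptcha/api2/replaceimage"


def saw_image_replacement(entries: Iterable[Mapping[str, str]]) -> bool:
    """Return True if any XHR/fetch entry targets the replaceimage endpoint."""
    return any(
        entry.get("initiatorType") in ("xmlhttprequest", "fetch")
        and REPLACE_IMAGE_MARKER in entry.get("name", "")
        for entry in entries
    )
```

Unlike the JavaScript version, this helper does not wait 10 seconds. It only checks entries that have already been gathered.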
diff --git a/main.py b/main.py
@@ -0,0 +1,139 @@
+import time
+from selenium import webdriver
+import os
+from twocaptcha import TwoCaptcha
+from utils.actions import PageActions
+from utils.helpers import CaptchaHelper
+
+# CONFIGURATION
+url = "https://2captcha.com/demo/recaptcha-v2"
+apikey = os.getenv('APIKEY_2CAPTCHA')  # Get the API key for the 2Captcha service from environment variables
+solver = TwoCaptcha(apikey)
+
+# LOCATORS
+l_iframe_captcha = "//iframe[@title='reCAPTCHA']"
+l_checkbox_captcha = "//span[@role='checkbox']"
+l_popup_captcha = "//iframe[contains(@title, 'two minutes')]"
+l_verify_button = "//button[@id='recaptcha-verify-button']"
+l_submit_button_captcha = "//button[@type='submit']"
+l_try_again = "//div[@class='rc-imageselect-incorrect-response']"
+l_select_more = "//div[@class='rc-imageselect-error-select-more']"
+l_dynamic_more = "//div[@class='rc-imageselect-error-dynamic-more']"
+l_select_something = "//div[@class='rc-imageselect-error-select-something']"
+
+# MAIN LOGIC
+options = webdriver.ChromeOptions()
+options.add_experimental_option('prefs', {'intl.accept_languages': 'en,en_US'})
+
+with webdriver.Chrome(options=options) as browser:
+    browser.get(url)
+    print("Started")
+
+    # Instantiate helper classes
+    page_actions = PageActions(browser)
+    captcha_helper = CaptchaHelper(browser, solver)
+
+    # We start by clicking on the captcha checkbox
+    page_actions.switch_to_iframe(l_iframe_captcha)
+    page_actions.click_checkbox(l_checkbox_captcha)
+    page_actions.switch_to_default_content()
+    page_actions.switch_to_iframe(l_popup_captcha)
+    time.sleep(1)
+
+    # Load JS files
+    script_get_data_captcha = captcha_helper.load_js_script('js_scripts/get_captcha_data.js')
+    script_change_tracking = captcha_helper.load_js_script('js_scripts/track_image_updates.js')
+
+    # Inject JS once
+    captcha_helper.execute_js(script_get_data_captcha)
+    captcha_helper.execute_js(script_change_tracking)
+
+    id = None  # Initialize the id variable for captcha
+
+    while True:
+        # Get captcha data by calling the JS function directly
+        captcha_data = browser.execute_script("return getCaptchaData();")
+
+        # Forming parameters for solving captcha
+        params = {
+            "method": "base64",
+            "img_type": "recaptcha",
+            "recaptcha": 1,
+            "cols": captcha_data['columns'],
+            "rows": captcha_data['rows'],
+            "textinstructions": captcha_data['comment'],
+            "lang": "en",
+            "can_no_answer": 1
+        }
+
+        # For a 3x3 captcha with a saved id, add previousID to the parameters
+        if params['cols'] == 3 and id:
+            params["previousID"] = id
+
+        print("Params before solving captcha:", params)
+
+        # Send captcha for solution
+        result = captcha_helper.solver_captcha(file=captcha_data['body'], **params)
+
+        if result is None:
+            print("Captcha solving failed or timed out. Stopping the process.")
+            break
+
+        # Check if the captcha was solved successfully
+        elif result and 'no_matching_images' not in result['code']:
+            # We save the id only on the first successful iteration for 3x3 captcha
+            if id is None and params['cols'] == 3 and result['captchaId']:
+                id = result['captchaId']  # Save id for subsequent iterations
+
+            answer = result['code']
+            number_list = captcha_helper.pars_answer(answer)
+
+            # Processing for 3x3
+            if params['cols'] == 3:
+                # Click on the answers found
+                page_actions.clicks(number_list)
+
+                # Check if the images have been updated
+                image_update = page_actions.check_for_image_updates()
+
+                if image_update:
+                    # If the images have been updated, continue with the saved id
+                    print(f"Images updated, continuing with previousID: {id}")
+                    continue  # Continue the loop
+
+                # Press the check button after clicks
+                page_actions.click_check_button(l_verify_button)
+
+            # Processing for 4x4
+            elif params['cols'] == 4:
+                # Click on the answers found and immediately press the check button
+                page_actions.clicks(number_list)
+                page_actions.click_check_button(l_verify_button)
+
+                # After clicking, we check for errors and image updates
+                image_update = page_actions.check_for_image_updates()
+
+                if image_update:
+                    print("Images updated, continuing without previousID")
+                    continue  # Continue the loop
+
+            # If the images are not updated, check the error messages
+            if captcha_helper.handle_error_messages(l_try_again, l_select_more, l_dynamic_more, l_select_something):
+                continue  # If an error is visible, restart the loop
+
+            # If there are no errors, send the captcha
+            page_actions.switch_to_default_content()
+            page_actions.click_check_button(l_submit_button_captcha)
+            break  # Exit the loop if the captcha is solved
+
+        elif 'no_matching_images' in result['code']:
+            # If the captcha returned the code "no_matching_images", check the errors
+            page_actions.click_check_button(l_verify_button)
+            if captcha_helper.handle_error_messages(l_try_again, l_select_more, l_dynamic_more, l_select_something):
+                continue  # Restart the loop if an error is visible
+            else:
+                page_actions.switch_to_default_content()
+                page_actions.click_check_button(l_submit_button_captcha)
+                break  # Exit loop
+
+    time.sleep(10)
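The loop above calls `captcha_helper.pars_answer(answer)`, whose body is not part of this diff. A minimal sketch of what such a parser might do follows. It assumes the answer string has the form `click:1/3/5`. Both that format and the name `parse_grid_answer` are assumptions, not taken from `utils/helpers.py`.

```python
from typing import List


def parse_grid_answer(code: str) -> List[int]:
    """Turn an answer such as 'click:1/3/5' into [1, 3, 5] (assumed format)."""
    prefix = "click:"
    if code.startswith(prefix):
        code = code[len(prefix):]
    # Keep only purely numeric parts, so a non-click answer yields an empty list.
    return [int(part) for part in code.split("/") if part.strip().isdigit()]
```

Returning an empty list for unexpected input keeps the caller's click loop a no-op instead of raising on a malformed answer.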