[ERROR] [corpora_requests:whether_result_found] The request is not correct #60

Open
TimurSamigulin opened this issue Mar 28, 2023 · 3 comments
Labels: bug (Something isn't working)

Comments

@TimurSamigulin

bug
[ERROR] [corpora_requests:whether_result_found] The request is not correct

code

    import rnc
    
    ru = rnc.MainCorpus(
        query='корпус', 
        p_count=5,
        marker=str.upper)
    
    ru.request_examples()

logs

    [28.03.2023 15:24:07,632] [DEBUG] [corpora_requests:is_request_correct] Validating that everything is OK
    [28.03.2023 15:24:07,634] [DEBUG] [corpora_requests:whether_result_found] Validating that the request is OK
    [28.03.2023 15:24:07,635] [INFO] [corpora_requests:get_htmls] Requested to 'https://processing.ruscorpora.ru/search.xml' [0;1) with params {'env': 'alpha', 'api': '1.0', 'lang': 'en', 'dpp': 5, 'spd': 10, 'text': 'lexgramm', 'out': 'normal', 'sort': 'i_grtagging', 'nodia': 1, 'lex1': 'корпус', 'mode': 'main'}
    [28.03.2023 15:24:07,636] [ERROR] [corpora_requests:whether_result_found] The request is not correct: {'env': 'alpha', 'api': '1.0', 'lang': 'en', 'dpp': 5, 'spd': 10, 'text': 'lexgramm', 'out': 'normal', 'sort': 'i_grtagging', 'nodia': 1, 'lex1': 'корпус', 'mode': 'main'}
    [28.03.2023 15:24:07,636] [ERROR] [corpora_requests:is_request_correct] HTTP request is wrong
    [28.03.2023 15:24:07,637] [ERROR] [corpora:request_examples] Query = ['корпус'], 5, {'env': 'alpha', 'api': '1.0', 'lang': 'en', 'dpp': 5, 'spd': 10, 'text': 'lexgramm', 'out': 'normal', 'sort': 'i_grtagging', 'nodia': 1, 'lex1': 'корпус', 'mode': 'main'}
    e = {'env': 'alpha', 'api': '1.0', 'lang': 'en', 'dpp': 5, 'spd': 10, 'text': 'lexgramm', 'out': 'normal', 'sort': 'i_grtagging', 'nodia': 1, 'lex1': 'корпус', 'mode': 'main'}
    Output exceeds the [size limit](command:workbench.action.openSettings?[). Open the full output data [in a text editor](command:workbench.action.openLargeOutput?df6b87e8-4758-497a-a54d-7a456297814e)
    ---------------------------------------------------------------------------
    RuntimeError                              Traceback (most recent call last)
    File ~/.conda-envs/venv2/lib/python3.8/site-packages/rnc/corpora_requests.py:200, in whether_result_found(url, **kwargs)
        199 try:
    --> 200     page_html = get_htmls(url, **kwargs)[0]
        201 except Exception:
    
    File ~/.conda-envs/venv2/lib/python3.8/site-packages/rnc/corpora_requests.py:164, in get_htmls(url, start, stop, **kwargs)
        162 coro_start = time.time()
    --> 164 html_codes = asyncio.run(
        165     get_htmls_coro(url, start, stop, **kwargs)
        166 )
        168 logger.info("Request was successfully completed")
    
    File ~/.conda-envs/venv2/lib/python3.8/asyncio/runners.py:33, in run(main, debug)
         32 if events._get_running_loop() is not None:
    ---> 33     raise RuntimeError(
         34         "asyncio.run() cannot be called from a running event loop")
         36 if not coroutines.iscoroutine(main):
    
    RuntimeError: asyncio.run() cannot be called from a running event loop
    
    During handling of the above exception, another exception occurred:
    
    RuntimeError                              Traceback (most recent call last)
    ...
    --> 294     raise WrongHTTPRequest(f"{kwargs}")
        295 logger.debug("HTTP request is correct, result found")
        297 logger.debug("Validating that the last page exists")
    
    WrongHTTPRequest: {'env': 'alpha', 'api': '1.0', 'lang': 'en', 'dpp': 5, 'spd': 10, 'text': 'lexgramm', 'out': 'normal', 'sort': 'i_grtagging', 'nodia': 1, 'lex1': 'корпус', 'mode': 'main'}

As I understand it, the library does not work with the current version of the site? The error occurs when requesting https://processing.ruscorpora.ru/search.xml
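A side note on the traceback: the first `RuntimeError` is raised because `asyncio.run()` is called while an event loop is already running, which is typical inside a notebook. Below is a minimal sketch of one common workaround, running the blocking call in a worker thread that has no running loop. `run_in_fresh_loop` and `blocking_call` are hypothetical names used only for illustration, and `blocking_call` merely stands in for any library function that calls `asyncio.run()` internally. This only sidesteps the event-loop error; it does not make the old `search.xml` endpoint work again.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def run_in_fresh_loop(func, *args, **kwargs):
    """Run a blocking function in a worker thread.

    The worker thread has no running event loop, so a function that
    calls asyncio.run() internally can create its own loop there.
    """
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(func, *args, **kwargs).result()


async def _fetch():
    # Placeholder for real asynchronous work (hypothetical).
    await asyncio.sleep(0)
    return "ok"


def blocking_call():
    # Stand-in for a library function that uses asyncio.run() inside.
    return asyncio.run(_fetch())


async def notebook_like_caller():
    # Here a loop is already running, as in a notebook cell.
    # Calling blocking_call() directly would raise RuntimeError.
    return run_in_fresh_loop(blocking_call)


if __name__ == "__main__":
    print(asyncio.run(notebook_like_caller()))
```

With the library from this issue, the call would presumably look like `run_in_fresh_loop(ru.request_examples)`, though that has not been checked here.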

TimurSamigulin added the bug label Mar 28, 2023
@kunansy
Owner

kunansy commented Apr 3, 2023

Thanks for the feedback!

Indeed, the site's design was updated and an API built on Django REST framework was added, so some corpora are not supported by the library for now (the parallel corpus is still available).

I'll try to fix it soon, when I have time.

@TimurSamigulin
Author

Thanks for the reply. They now encode the request to the API, https://ruscorpora.ru/ru/api/search?query=KhgKCAgAEAoYMiAKEAUgACiWserh7KRFeAAyAggBOgEBQhMKEQoPCgNyZXESCAoG0LrQvtGC, but I haven't yet managed to figure out how to obtain the query parameter.

@kunansy
Owner

kunansy commented Apr 3, 2023

@TimurSamigulin Made by aliens, not for humans. The first step is URL escaping; after that there is evidently a base64 encoding step (at which point it is already fair to ask: why?), and then something else on top.
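The two steps named above can be reversed to look at what the `query` parameter carries. This is a minimal sketch under assumptions: `decode_query` is a hypothetical helper, the alphabet is assumed to be standard or URL-safe base64, and missing padding is restored by hand. The format of the decoded bytes (the "something else") is not established in this thread, so the code only exposes the raw bytes.

```python
import base64
from urllib.parse import unquote

# The query value from the URL quoted earlier in this thread.
QUERY = (
    "KhgKCAgAEAoYMiAKEAUgACiWserh7KRFeAAyAggBOgEBQhMKEQoPCgNyZXESCAoG0LrQvtGC"
)


def decode_query(query: str) -> bytes:
    """Undo URL escaping, then base64 (hypothetical helper)."""
    unescaped = unquote(query)
    # Restore '=' padding in case it was stripped (assumption).
    padded = unescaped + "=" * (-len(unescaped) % 4)
    # urlsafe_b64decode also accepts the standard '+' and '/' characters.
    return base64.urlsafe_b64decode(padded)


if __name__ == "__main__":
    raw = decode_query(QUERY)
    print(raw)
    # Show any readable text embedded in the binary payload.
    print(raw.decode("utf-8", errors="replace"))
```

For the sample URL, the decoded bytes end with the UTF-8 encoding of `кот` and contain the ASCII string `req`, which suggests the search word is stored as plain text inside some binary structure. The rest of the layout remains unidentified here.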

