mDNS [BUG] halt on semaphore if iot button solution is used and use in a GPIO2 that is pushed to GND on button creation (IDFGH-14209) #708

Open
filzek opened this issue Dec 9, 2024 · 4 comments
Labels
Status: Opened Issue is new

filzek commented Dec 9, 2024

Answers checklist.

  • I have read the documentation for esp-protocols components and the issue is not addressed there.
  • I have updated my esp-protocols branch (master or release) to the latest version and checked that the issue is present there.
  • I have searched the issue tracker for a similar issue and not found a similar issue.

General issue report

After debugging, we found that when mDNS is used together with the Button component and GPIO2 is configured as a button, a task locks up inside mDNS.

With the new SDK 5.3.2, and with older versions, the calling task blocks inside mdns_query_ptr and never returns.

It happens when the IoT Solution Button component is used (espressif/button v3.4.0).

The Button component breaks mDNS when it adds GPIO2: if GPIO2 is used as a button, mDNS just halts.

esp_err_t err2 = mdns_query_ptr(service_name, proto, 5000, 100, &results2);

The call never returns. We found that the lock occurs in

esp_err_t mdns_query_generic(const char *name, const char *service, const char *proto, uint16_t type, mdns_query_transmission_type_t transmission_type, uint32_t timeout, size_t max_results, mdns_result_t **results)

at:
xSemaphoreTake(search->done_semaphore, portMAX_DELAY);

This only happens if GPIO2 is used to create the button.

GPIO2 must be tied to GND when btn = iot_button_create(&btn_cfg); is called for the bug to occur.

@espressif-bot espressif-bot added the Status: Opened Issue is new label Dec 9, 2024
@github-actions github-actions bot changed the title from "mDNS [BUG] halt on semaphore if iot button solution is used and use in a GPIO2 that is pushed to GND on button creation" to "mDNS [BUG] halt on semaphore if iot button solution is used and use in a GPIO2 that is pushed to GND on button creation (IDFGH-14209)" Dec 9, 2024
david-cermak (Collaborator) commented

Hi @filzek

Are you calling mdns_query_ptr(timeout=5s) inside the button callback? Please note that this API will block the event callback for 5 seconds. Moreover, the mdns component itself uses esp_timer to process queries, so while the timer task is blocked it can neither post queries nor process answers, which blocks the whole system.
I'd suggest using async queries or posting synchronous queries outside the callback.

filzek (Author) commented Dec 9, 2024

No, I am calling it from another task that has priority 5; the mdns task also has priority 5.

What is odd is that if we disable IO2 on the button, it works. The button has 7 IOs set up; no matter whether we configure just 1 or all 7, the crash happens whenever IO2 is enabled.

We can try moving to async queries, but we think the semaphore wait is somehow being blocked while IO2 is set up. It only happens if IO2 is set up and tied to GND; if the pin is left open, the crash doesn't happen.

david-cermak (Collaborator) commented

Would you please share a simplified project or a reproducer, then? Calling mdns_query() from a separate task with the same prio as the mdns task works without an issue.
Also, you're talking about a crash, could you please share the logs, backtrace, type of the exception?

david-cermak (Collaborator) commented

@filzek Any news about the reproducer? I tried checking myself and was able to reproduce the issue when querying directly in the callback (as mentioned above), but it works as expected when the query is made from a separate task.
