Condition destroy error shouldn't be a fatal error

When we stop a BE, it always hits a fatal error:

`F0213 12:33:53.604131 117681 utils.cpp:1124] fail to destroy cond. err=Device or resource busy`

https://github.com/apache/incubator-doris/blob/fd492e3b6fd729e617536842ba4092911f8afae8/be/src/olap/utils.cpp#L133-L139

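EBUSY means that something tried to destroy the object referenced by cond while another thread still references it (for example, while that thread is blocked in `pthread_cond_wait()` or `pthread_cond_timedwait()`).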

It's a common fault in multi-threaded code, so we shouldn't make it fatal after a single attempt.
How about making it fatal only after several failed attempts? As follows:

```
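// After each failed destroy, sleep one second and retry; once 20 retries
// have been used up, treat the failure as fatal (LOG(FATAL) aborts).
#define PTHREAD_COND_DESTROY_WITH_LOG(condptr) \
    do { \
        int cond_ret = 0; \
        int try_time = 0; \
        while (0 != (cond_ret = pthread_cond_destroy(condptr))) { \
            if (try_time++ < 20) { \
                sleep(1); \
            } else { \
                LOG(FATAL) << "fail to destroy cond. err=" << strerror(cond_ret); \
            } \
        } \
    } while (0)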
```
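
For discussion, the same bounded-retry idea could also be written as a function that returns the last error code instead of aborting, so the caller decides how severe the failure is. This is only a sketch: `destroy_with_retry`, `cond_destroy_with_retry`, `max_retries`, and `delay` are hypothetical names, not existing Doris code.

```cpp
#include <pthread.h>
#include <cerrno>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper, not part of Doris: calls destroy_fn once, then retries
// up to max_retries more times while it keeps failing, sleeping `delay`
// between attempts. Returns the last return code (0 on success).
inline int destroy_with_retry(const std::function<int()>& destroy_fn,
                              int max_retries,
                              std::chrono::milliseconds delay) {
    int ret = destroy_fn();
    for (int retry = 0; ret != 0 && retry < max_retries; ++retry) {
        std::this_thread::sleep_for(delay);
        ret = destroy_fn();
    }
    return ret;
}

// Convenience wrapper (also hypothetical) for a real condition variable.
inline int cond_destroy_with_retry(pthread_cond_t* cond, int max_retries,
                                   std::chrono::milliseconds delay) {
    return destroy_with_retry([cond] { return pthread_cond_destroy(cond); },
                              max_retries, delay);
}
```

A caller could then log a warning rather than a fatal error when the returned code is non-zero, for example during shutdown.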

My test result:
~~It will wait 10~15s when the be is idle.~~
----2020/02/26----
The 10s wait was a misunderstanding on my part. My stop procedure is:
1. send SIGTERM
2. wait 10s, if process can't exit, send SIGKILL

So if only SIGTERM is sent, the BE may take longer to destroy itself.
The root cause is the thread pool management, so this issue should be closed.
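
For reference, the stop procedure described above (SIGTERM, a grace period, then SIGKILL) can be sketched roughly as below. This is an illustration only, not my actual stop script: it assumes the process being stopped is a child of the caller so that `waitpid` can be used, and `stop_child` and `grace` are hypothetical names.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <chrono>
#include <thread>

// Hypothetical helper: sends SIGTERM to a child process, polls for its exit
// until the grace period ends, then falls back to SIGKILL.
// Returns true if the child exited within the grace period,
// false if SIGKILL was needed.
inline bool stop_child(pid_t pid, std::chrono::milliseconds grace) {
    int status = 0;
    kill(pid, SIGTERM);
    const auto deadline = std::chrono::steady_clock::now() + grace;
    for (;;) {
        if (waitpid(pid, &status, WNOHANG) == pid) {
            return true;  // child exited after SIGTERM
        }
        if (std::chrono::steady_clock::now() >= deadline) {
            break;  // grace period is over
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
    kill(pid, SIGKILL);
    waitpid(pid, &status, 0);  // reap the killed child
    return false;
}
```

With a 10s grace period, a process that needs longer than that to shut down cleanly is always killed, which matches what I saw.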

Condition destroy error shouldn't be a fatal error #2893
