[FastTokenizer] Support FastTokenizer on Android #783
Changes from 5 commits
5f25414
e34ce24
49cb7fa
aac361d
baf6dd7
3c9558a
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -53,7 +53,7 @@ static cv::Mat FastVisSegmentationNEON( | |
const uint8_t *im_ptr = static_cast<const uint8_t*>(im.data); | ||
|
||
if (!quantize_weight) { | ||
-#pragma omp parallel for num_threads(2) schedule(static) | ||
+#pragma omp parallel for proc_bind(close) num_threads(2) schedule(static) | ||
for (int i = 0; i < size - 15; i += 16) { | ||
uint8x16_t labelx16 = vld1q_u8(label_ptr + i); // 16 bytes | ||
// e.g 0b00000001 << 7 -> 0b10000000 128; | ||
|
@@ -87,7 +87,7 @@ static cv::Mat FastVisSegmentationNEON( | |
|
||
if (new_multi_factor == 8) { | ||
// Only keep mask, no need to blending with origin image. | ||
-#pragma omp parallel for num_threads(2) schedule(static) | ||
+#pragma omp parallel for proc_bind(close) num_threads(2) schedule(static) | ||
Review comment: The value num_thread = 2 could be written as a constant, e.g. Reply: Jiajun's earlier suggestion here was to hold off on exposing an extra interface to users for now, so as not to raise their learning cost. The default of 2 follows the big/little core layout of phone chips, typically 1 prime + 3 big + 4 little cores or 2 prime + 2 big + 4 little cores, and running on the little cores is much slower. In experiments on different devices, 2 threads gave the best speedup. Reply: done |
||
for (int i = 0; i < size - 15; i += 16) { | ||
uint8x16_t labelx16 = vld1q_u8(label_ptr + i); // 16 bytes | ||
// e.g 0b00000001 << 7 -> 0b10000000 128; | ||
|
@@ -112,7 +112,7 @@ static cv::Mat FastVisSegmentationNEON( | |
uint8x16_t old_mulx16 = vdupq_n_u8(old_multi_factor); | ||
uint8x16_t new_mulx16 = vdupq_n_u8(new_multi_factor); | ||
// Blend the two colors together with quantize 'weight'. | ||
-#pragma omp parallel for num_threads(2) schedule(static) | ||
+#pragma omp parallel for proc_bind(close) num_threads(2) schedule(static) | ||
for (int i = 0; i < size - 15; i += 16) { | ||
uint8x16x3_t bgrx16x3 = vld3q_u8(im_ptr + i * 3); // 48 bytes | ||
uint8x16_t labelx16 = vld1q_u8(label_ptr + i); // 16 bytes | ||
|
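The review thread on `num_threads(2)` in the diff above asks whether the literal could become a named constant. A minimal sketch of what that might look like follows. It is not code from this PR: `kDefaultOmpThreads` and `LabelsToMask` are hypothetical names, and the loop is a scalar stand-in for the NEON kernel so that it builds on any target. It only reproduces the `label << 7` step noted in the diff comment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical constant (not in the PR): the PR keeps the literal 2 in each
// pragma. 2 is the value the author reports as giving the best speedup on
// phone chips with prime/big/little cores.
constexpr int kDefaultOmpThreads = 2;

// Scalar stand-in for the NEON loop: maps label 1 to 128 (1 << 7).
std::vector<std::uint8_t> LabelsToMask(const std::vector<std::uint8_t>& labels) {
  // The OpenMP loop below uses an int index, as the loops in the diff do.
  assert(labels.size() <=
         static_cast<std::size_t>(std::numeric_limits<int>::max()));
  std::vector<std::uint8_t> mask(labels.size());
  const int size = static_cast<int>(labels.size());
  const std::uint8_t* src = labels.data();
  std::uint8_t* dst = mask.data();
  // Same clauses as the PR, with the thread count taken from the constant.
  // Without OpenMP enabled, the pragma is ignored and the loop runs serially.
#pragma omp parallel for proc_bind(close) num_threads(kDefaultOmpThreads) schedule(static)
  for (int i = 0; i < size; ++i) {
    dst[i] = static_cast<std::uint8_t>(src[i] << 7);
  }
  return mask;
}
```

Keeping the constant internal, rather than exposing it as a user-facing option, would stay in line with the author's reply about not adding a new interface for now.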
Doesn't this need a WITH_OMP check?
No, it isn't needed here. The compiler handles this automatically: because the omp.h header is not used, a WITH_OPENMP check is not required for now.
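The distinction drawn in this reply can be sketched as follows: a bare `#pragma omp` line compiles with or without OpenMP, while anything that needs `omp.h` has to be guarded. This is an illustration only, not code from the PR. It uses the standard `_OPENMP` macro instead of the project's `WITH_OPENMP` definition, whose exact setup is not shown here, and `MaxThreadsOrOne` and `SumBytes` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// The header and the runtime calls need a guard: omp.h may not exist, and
// omp_* functions are not declared when OpenMP is disabled.
#if defined(_OPENMP)
#include <omp.h>
#endif

int MaxThreadsOrOne() {
#if defined(_OPENMP)
  return omp_get_max_threads();
#else
  return 1;  // Serial build: report a single thread.
#endif
}

std::uint64_t SumBytes(const std::uint8_t* data, int size) {
  assert(data != nullptr || size == 0);
  std::uint64_t total = 0;
  // No guard needed: a compiler without OpenMP enabled ignores this pragma
  // and the loop simply runs on one thread.
#pragma omp parallel for reduction(+ : total) num_threads(2) schedule(static)
  for (int i = 0; i < size; ++i) {
    total += data[i];
  }
  return total;
}
```

Both functions return the same results whether or not the build enables OpenMP. Only the degree of parallelism changes.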