ELU Layer not supported on Vulkan #4122

Closed
RaghavPrabhakar66 opened this issue Aug 8, 2022 · 1 comment

error log

Layer-wise time taken during inference.

context

Compiling the code with the benchmark flag shows that every layer completes its calculations in microseconds (us), except the ELU layer, which takes milliseconds (ms).
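For reference, ELU itself is a cheap element-wise operation, so the arithmetic alone is unlikely to account for a millisecond-scale timing. One plausible reading (an assumption, not something measured in this report) is that a layer without a Vulkan path runs on the CPU, and the extra cost comes from moving the blob between GPU and CPU around it. A minimal CPU sketch of the function, where `elu_inplace` and its `alpha` default are illustrative names and not ncnn API:

```cpp
#include <cmath>
#include <vector>

// Reference ELU: f(x) = x for x >= 0, alpha * (exp(x) - 1) for x < 0.
// elu_inplace is a hypothetical helper for illustration, not an ncnn function.
static void elu_inplace(std::vector<float>& data, float alpha = 1.0f)
{
    for (float& v : data)
    {
        // Non-negative values pass through unchanged.
        if (v < 0.f)
            v = alpha * (std::exp(v) - 1.f);
    }
}
```

Calling `elu_inplace` on a vector filled with 0.01f, as the benchmark below fills its inputs, leaves every element unchanged, since only negative values are modified.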

how to reproduce

Param File
Bin File

void benchmark_supercombo(const char* comment, const ncnn::Mat& _in1, const ncnn::Mat& _in2, const ncnn::Mat& _in3, const ncnn::Mat& _in4, const ncnn::Option& opt)
{
    ncnn::Mat in1 = _in1;
    ncnn::Mat in2 = _in2;
    ncnn::Mat in3 = _in3;
    ncnn::Mat in4 = _in4;
    in1.fill(0.01f);
    in2.fill(0.01f);
    in3.fill(0.01f);
    in4.fill(0.01f);

    g_blob_pool_allocator.clear();
    g_workspace_pool_allocator.clear();

#if NCNN_VULKAN
    if (opt.use_vulkan_compute)
    {
        g_blob_vkallocator->clear();
        g_staging_vkallocator->clear();
    }
#endif // NCNN_VULKAN

    ncnn::Net net;

    net.opt = opt;

#if NCNN_VULKAN
    if (net.opt.use_vulkan_compute)
    {
        net.set_vulkan_device(g_vkdev);
    }
#endif // NCNN_VULKAN

#ifdef __EMSCRIPTEN__
#define MODEL_DIR "/working/"
#else
#define MODEL_DIR ""
#endif

    char parampath[256];
    // bounded write so a long model name cannot overflow parampath
    snprintf(parampath, sizeof(parampath), MODEL_DIR "%s.param", comment);
    net.load_param(parampath);

    DataReaderFromEmpty dr;
    net.load_model(dr);

    const std::vector<const char*>& input_names = net.input_names();
    const std::vector<const char*>& output_names = net.output_names();

    if (g_enable_cooling_down)
    {
        // sleep 10 seconds for cooling down SOC  :(
#ifdef _WIN32
        Sleep(10 * 1000);
#elif defined(__unix__) || defined(__APPLE__)
        sleep(10);
#elif _POSIX_TIMERS
        struct timespec ts;
        ts.tv_sec = 10;
        ts.tv_nsec = 0;
        nanosleep(&ts, &ts);
#else
        // TODO How to handle it ?
#endif
    }

    ncnn::Mat out;

    // warm up
    for (int i = 0; i < g_warmup_loop_count; i++)
    {
        ncnn::Extractor ex = net.create_extractor();
        // bind each tensor to its own input blob (assumes the model declares four inputs in this order)
        ex.input(input_names[0], in1);
        ex.input(input_names[1], in2);
        ex.input(input_names[2], in3);
        ex.input(input_names[3], in4);
        ex.extract(output_names[0], out);
    }

    double time_min = DBL_MAX;
    double time_max = -DBL_MAX;
    double time_avg = 0;

    for (int i = 0; i < g_loop_count; i++)
    {
        double start = ncnn::get_current_time();

        {
            ncnn::Extractor ex = net.create_extractor();
            // bind each tensor to its own input blob (assumes the model declares four inputs in this order)
            ex.input(input_names[0], in1);
            ex.input(input_names[1], in2);
            ex.input(input_names[2], in3);
            ex.input(input_names[3], in4);
            ex.extract(output_names[0], out);
        }

        double end = ncnn::get_current_time();

        double time = end - start;

        time_min = std::min(time_min, time);
        time_max = std::max(time_max, time);
        time_avg += time;
    }

    time_avg /= g_loop_count;

    fprintf(stderr, "%20s  min = %7.2f  max = %7.2f  avg = %7.2f\n", comment, time_min, time_max, time_avg);
}

int main(int argc, char** argv)
{
    ...

    benchmark_supercombo("supercombo", ncnn::Mat(128, 256, 12), ncnn::Mat(8, 1), ncnn::Mat(2, 1), ncnn::Mat(512, 1), opt);
}
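
The warm-up and min/max/avg bookkeeping in `benchmark_supercombo` can be tried in isolation, without ncnn or a model file. The sketch below is only an illustration of that timing pattern: `TimingStats` and `time_runs` are hypothetical names, and `std::chrono::steady_clock` stands in for `ncnn::get_current_time()`.

```cpp
#include <algorithm>
#include <cfloat>
#include <chrono>
#include <functional>

// Hypothetical result type for illustration; not part of ncnn.
struct TimingStats
{
    double min_ms;
    double max_ms;
    double avg_ms;
};

// Calls run() warmup_count times untimed, then loop_count times timed,
// and returns the min / max / average wall-clock time in milliseconds.
// With loop_count == 0 the min and max keep their sentinel values.
static TimingStats time_runs(const std::function<void()>& run, int warmup_count, int loop_count)
{
    // warm up, results are discarded
    for (int i = 0; i < warmup_count; i++)
        run();

    TimingStats stats = {DBL_MAX, -DBL_MAX, 0.0};

    for (int i = 0; i < loop_count; i++)
    {
        auto start = std::chrono::steady_clock::now();
        run();
        auto end = std::chrono::steady_clock::now();

        double ms = std::chrono::duration<double, std::milli>(end - start).count();

        stats.min_ms = std::min(stats.min_ms, ms);
        stats.max_ms = std::max(stats.max_ms, ms);
        stats.avg_ms += ms;
    }

    if (loop_count > 0)
        stats.avg_ms /= loop_count;

    return stats;
}
```

Passing a lambda that creates the extractor, sets the inputs and calls `extract` would reproduce the loop above. A large gap between `min_ms` and `max_ms` usually means the warm-up count is too low.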

nihui commented Jan 29, 2023

implemented in #4280

nihui closed this as completed Jan 29, 2023