
How to extend this project for selfie segmentation? #20

Closed
postacik opened this issue Oct 8, 2022 · 9 comments
@postacik

postacik commented Oct 8, 2022

Hi,
This repo is the only one I could find that uses mediapipe in a C++ application. Thanks for sharing it.

Could you please show the right way to add a class for selfie segmentation and create a demo similar to the Python example?

@postacik
Author

I started adding a helper class for selfie segmentation:

#include "SelfieSegmentation.hpp"


my::SelfieSegmentation::SelfieSegmentation(std::string modelDir) :
    my::ModelLoader(modelDir + std::string("/selfie_segmentation.tflite")) 
{}


void my::SelfieSegmentation::loadImageToInput(const cv::Mat& in, int index) {
    ModelLoader::loadImageToInput(in);
}


void my::SelfieSegmentation::runInference() {
    ModelLoader::runInference();
}

However, ModelLoader fails in the allocateTensors() function with the following message:

ERROR: Encountered unresolved custom op: Convolution2DTransposeBias.
ERROR: Node number 244 (Convolution2DTransposeBias) failed to prepare.

ERROR: Failed to apply the default TensorFlow Lite delegate indexed at 0.
Failed to allocate tensors.

Have you ever encountered such an error?

@pntt3011
Owner

Hi @postacik, I'm sorry that I hadn't noticed your issue until your latest response.
About the segmentation model: I have been very busy recently and do not have time to look at the segmentation graph. My code is just about running the tflite model, and the face detection graph is quite easy to implement.
About the error you encountered: it's because you are using the standard tflite, while mediapipe uses its own custom tflite build with some additional ops (same as #12).

@postacik
Author

postacik commented Oct 10, 2022

Hi @pntt3011, thank you for your reply.
I've just seen the resolver implementations in the mediapipe library and I'm trying to use them in the ModelLoader class for selfie segmentation. I'll report here if I can succeed.

@postacik
Copy link
Author

I got allocateTensors() to succeed by adding the following resolver in the buildInterpreter() function:

void my::ModelLoader::buildInterpreter(int numThreads) {
    tflite::ops::builtin::BuiltinOpResolver resolver;

    // Register mediapipe's custom op so the segmentation model can be prepared.
    resolver.AddCustom("Convolution2DTransposeBias", mediapipe::tflite_operations::RegisterConvolution2DTransposeBias());

    if (tflite::InterpreterBuilder(*m_model, resolver)(&m_interpreter) != kTfLiteOk) {
        std::cerr << "Failed to build interpreter." << std::endl;
        std::exit(1);
    }
    m_interpreter->SetNumThreads(numThreads);
}

I copied the RegisterConvolution2DTransposeBias() function from the mediapipe source code.

However, when I call ModelLoader::loadOutput() to get the output picture as a mask (see picture below), the function returns a float array.

[image attachments: output tensor inspection]

Your ModelLoader class has a loadImageToInput() function but no loadImageFromOutput() function.

How can I convert this float array to a matrix of the same size as the input image?

@pntt3011
Owner

@postacik I am delighted to hear about your success. You can try resizing the output tensor to the same width and height as the input image.
As far as I know, Mask R-CNN also predicts the mask at a fixed size and then resizes it to the input size. Some papers, like PointRend, improve the upsampling with some conv blocks.

@postacik
Author

I think I should do the reverse of this function from your code:

cv::Mat my::ModelLoader::preprocessImage(const cv::Mat& in, int idx) const {
    auto out = convertToRGB(in);

    std::vector<int> inputShape = getInputShape(idx);
    int H = inputShape[1];
    int W = inputShape[2]; 

    cv::Size wantedSize = cv::Size(W, H);
    cv::resize(out, out, wantedSize);

    /*
    Equivalent to (out - mean)/ std
    */
    out.convertTo(out, CV_32FC3, 1 / INPUT_NORM_STD, -INPUT_NORM_MEAN / INPUT_NORM_STD);
    return out;
}

Am I on the right path?

@postacik
Author

I wrote the helper functions below and they seem to work:

std::vector<float> my::SelfieSegmentation::getSegmentationMask() const {
    return ModelLoader::loadOutput(0);
}

cv::Mat my::SelfieSegmentation::loadOutputImage(int imageHeight, int imageWidth) const
{
    auto vec = getSegmentationMask();
    std::vector<int> outputShape = getOutputShape(0);
    int H = outputShape[1];
    int W = outputShape[2];
    cv::Mat out = cv::Mat(H, W, CV_32FC1);

    if (vec.size() == static_cast<size_t>(H * W)) // size() is the element count, so compare against H * W
    {
        // copy vector to mat (byte count = element count * sizeof(float))
        memcpy(out.data, vec.data(), vec.size() * sizeof(float));
    }
    cv::Size wantedSize = cv::Size(imageWidth, imageHeight);
    cv::resize(out, out, wantedSize);
    return out;
}

[image attachment: resulting segmentation mask]

I would appreciate your valuable comments.

@pntt3011
Owner

Hi @postacik, I think the way you did it is correct. Congratulations!

@postacik
Author

Thanks for your help, closing...
