Android support #12
Comments
Hi, I'm not good at Android development, and my energy is also devoted to other projects, so I'd be glad if someone could help : ) |
I will give it a try; right now I am working on the model. |
I split the model into Running result under a Xiaomi 12S @ Snapdragon 8+ Gen 1 SoC:
The efforts I have tried before:
I observed the device status in profile mode and found that CPU usage is only about 45%. I am not sure whether the NPU was used during inference. I'm still evaluating whether it's worth continuing, so any advice would be appreciated 😄
|
I've sent you an email; please check it when you have time. Respect! |
Looking forward to your work; I really want this on Android. |
Maybe you can briefly introduce the required technology stack and technical route, so that others can participate 😄 |
@greyovo for the |
@mazzzystar Thanks for the advice. I actually did preprocess the images (i.e. resizing to 224px, center cropping, normalizing..., like CLIP's |
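For readers following along, the crop and normalize steps mentioned above can be sketched in plain Python. This is only an illustration, not the code used in this thread: the function names are hypothetical, the image is assumed to be an H x W x 3 nested list of 0..255 RGB values, and the resize step (bicubic in CLIP's reference pipeline) is assumed to have happened already. The mean and standard deviation are the constants published with OpenAI's CLIP.

```python
# CLIP's published per-channel normalization constants (RGB order).
CLIP_MEAN = (0.48145466, 0.4578275, 0.40821073)
CLIP_STD = (0.26862954, 0.26130258, 0.27577711)


def center_crop(pixels, size):
    """Crop a size x size window from the middle of an H x W x 3 nested list."""
    height, width = len(pixels), len(pixels[0])
    if height < size or width < size:
        raise ValueError("image is smaller than the crop size; resize it first")
    top = (height - size) // 2
    left = (width - size) // 2
    return [row[left:left + size] for row in pixels[top:top + size]]


def normalize(pixels):
    """Scale 0..255 values to 0..1, then subtract the mean and divide by the std."""
    return [
        [
            [(value / 255.0 - CLIP_MEAN[c]) / CLIP_STD[c]
             for c, value in enumerate(pixel)]
            for pixel in row
        ]
        for row in pixels
    ]


def preprocess(pixels, size=224):
    """Hypothetical helper: assumes the shorter side was already resized to `size`."""
    return normalize(center_crop(pixels, size))
```

On Android the same arithmetic would have to be reproduced exactly (channel order, scaling, constants); otherwise the embeddings drift away from what the Python pipeline produces.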
@Young-Flash In fact, most of the work relies on the Colab script provided by the author @mazzzystar; I just exported the two encoders as separate PyTorch JIT models instead of CoreML. Possible improvements, in my opinion:
Perhaps the easiest way to speed up the encoders is to convert the model for NNAPI. I tried following PyTorch's official tutorial but failed. It seems to require a PC with an ARM64 processor; I'm not sure if I missed something. |
I think you may be able to quantize the model when exporting it from PyTorch; see pytorch/pytorch#76726 |
@mazzzystar Yes, I also tried quantization but ran into several problems that I could not solve, so the quantization failed, not to mention the NNAPI conversion (which needs a quantized model). I may later share the Jupyter notebook I used to see if anyone can help. |
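As background on what quantization does to the weights, here is a minimal sketch of per-tensor affine int8 quantization in plain Python. It is an illustration of the general idea only, not the PyTorch or ONNX Runtime implementation, and the function names are hypothetical.

```python
def quantize_int8(values):
    """Map floats to int8 using one scale and zero point for the whole tensor."""
    lo = min(min(values), 0.0)  # widen the range so 0.0 stays exactly representable
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0
    if scale == 0.0:
        scale = 1.0  # all-zero tensor: any scale works
    zero_point = round(-128 - lo / scale)
    quantized = [
        max(-128, min(127, round(v / scale) + zero_point)) for v in values
    ]
    return quantized, scale, zero_point


def dequantize_int8(quantized, scale, zero_point):
    """Recover approximate floats from the int8 codes."""
    return [(q - zero_point) * scale for q in quantized]


weights = [-0.3, 0.0, 0.25, 0.9]
codes, scale, zero_point = quantize_int8(weights)
restored = dequantize_int8(codes, scale, zero_point)
```

Each weight is stored in one byte instead of four, at the cost of a rounding error of roughly one quantization step (`scale`) or less per value. That error is why a quantized encoder can produce slightly different embeddings from the original.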
An interesting thing is that I found some efforts on distilling the CLIP model:
At least they show that knowledge distillation may be a feasible direction, though it would require notable effort. |
@greyovo I'm not sure distillation is a good idea compared to quantization. Even if you don't quantize the model, using I might try to export a lightweight Android version if I have time in the future : ) Is ONNX okay? |
@mazzzystar You are right! There's another option - ONNX. It seems they have complete docs and demos. So yes, it's worth a try! Thanks :) |
@mazzzystar Hi, I didn't quite understand where things stand with the Android app. I would like to take up this project in my spare time and rewrite part of the logic in Kotlin (KMP), but I would need help from an AI specialist. Are there any developments in this direction? |
@stakancheck The original ViT-B/32 model was too large for Android devices (see the discussion above), so encoding images into embeddings was much slower than on iOS. We are working on the model to see whether @mazzzystar or I can export a lightweight version that runs faster and takes less space. By the way, are you familiar with Kotlin or Jetpack Compose? I am a beginner in Android development (I used Flutter before), but I would love to help build the app :) |
I have made some progress on quantization with ONNX. Please check my repo CLIP-android-demo for details :) |
Deploying with NCNN requires solid C++ and JNI development skills, which I don't have... Sorry. |
Haha, it doesn't matter at all; it was just a suggestion. You have done a great job! |
@greyovo Thanks for your great work; I would like to see an Android app : ) |
I verified the ONNX quantized model; the code is here, and the results on my local machine are as follows:
I think it is good to go. |
CLIP doesn't support Chinese well (see here). I tested the same image with Chinese input (["老虎", "猫", "狗", "熊"]) and English input (["a tiger", "a cat", "a dog", "a bear"]); the logits are [[0.09097634 0.18403262 0.24364232 0.4813488 ]] and [[0.04703762 0.9391219 0.00990335 0.00393698]] respectively, so the Chinese result isn't ideal. @mazzzystar How do you deal with Chinese text input in I tried Chinese-CLIP and the ONNX quantized model today and got good results; the code is here, and the results are as follows:
|
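To make the comparison of the two OpenAI CLIP score vectors quoted above concrete, here is a small plain-Python sketch. Each quoted vector sums to about 1, so they appear to be softmax probabilities rather than raw logits; that reading is an assumption. The helper names are hypothetical and the `softmax` function is included only to show how such scores are typically derived.

```python
import math


def softmax(logits):
    """Turn raw similarity logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(labels, scores):
    """Return the label with the highest score, together with that score."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best], scores[best]


chinese_labels = ["老虎", "猫", "狗", "熊"]
english_labels = ["a tiger", "a cat", "a dog", "a bear"]
chinese_scores = [0.09097634, 0.18403262, 0.24364232, 0.4813488]
english_scores = [0.04703762, 0.9391219, 0.00990335, 0.00393698]

print(top_label(chinese_labels, chinese_scores))
print(top_label(english_labels, english_scores))
```

With the English prompts the model puts about 0.94 on "a cat", while with the Chinese prompts the top score (about 0.48) goes to "熊" (bear) and "猫" (cat) gets only about 0.18, which is the mismatch described in the comment.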
@Young-Flash |
I see. I found a demo that uses Chinese queries and thought it was translating Chinese into English, but I couldn't find the relevant code here, so I was puzzled. Chinese-CLIP is a pre-trained model with an MIT license. The above |
@Young-Flash That's exactly what I mean. Notice that the Chinese-CLIP's arch( |
Yeah, I have done it here. |
Guys, any update on the Android version? I really want it. |
I am blocked by a weird onnxruntime issue: the text encoder running on Android produces the same inference result as Python, while the ViT image encoder does not. |
@Young-Flash Same here. 😢 But I found that converting the And I also observed that the quantized model may show this problem while the original model does not. By the way, I have finished the basic indexing and querying features and am now working on UI issues. I might replace the model with Chinese-CLIP in the near future. |
@greyovo I tried Do you plan to share your code? I think Chinese-CLIP is worth a try, since it supports both Chinese and English. Feel free to let me know if there's anything I can help with. |
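One way to pin down a mismatch like the one described above is to dump the embedding of the same image from both runtimes and compare them numerically. The sketch below is plain Python with hypothetical function names and tolerance values, chosen for illustration only; it assumes both embeddings have been exported as flat lists of floats.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def max_abs_diff(a, b):
    """Largest element-wise difference between two equal-length vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


def outputs_match(reference, candidate, min_cosine=0.99, max_diff=1e-2):
    """Hypothetical check: thresholds are illustrative, not taken from this thread."""
    if len(reference) != len(candidate):
        raise ValueError("embeddings have different lengths")
    return (
        cosine_similarity(reference, candidate) >= min_cosine
        and max_abs_diff(reference, candidate) <= max_diff
    )
```

If the text encoder passes such a check and the image encoder fails it, feeding both runtimes the same already-preprocessed tensor helps separate a preprocessing difference from a difference in the model itself.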
I agree.
I will try Chinese-CLIP. I need to apply for a Software Copyright Certificate (aka. 软件著作权) to get it on the app market, and then I'll make it open source.
Thanks in advance :) @Young-Flash |
I have already developed an Android app named smartSearch. You are welcome to try it. |
Got here from my previous github issue there: microsoft/onnxruntime#16472, which linked this repo. |
I tried it and it didn't work; it crashes a few seconds after it starts building the index. |
Could you share some device info? Which brand, and which OS version? |
OnePlus 11, ColorOS13, with 16GB memory, should be ok to run. |
Hi, have you solved that issue? And do you use OpenAI CLIP or Chinese-CLIP? |
Yeah, I've been using Chinese-CLIP. |
Great news! The Android app (PicQuery) is now on Google Play for free and supports both English and Chinese: https://play.google.com/store/apps/details?id=me.grey.picquery The source code will be public soon, as I need to clean up a few things :) |
@greyovo Great! I will update the Android code and app details in the README after your work is complete. :) |
@greyovo Nice UI, thanks for your great work. |
Hi, when indexing the album, the app crashes every time after scanning roughly 800 to 900 images: java.io.FileNotFoundException: Failed to create image decoder with message 'invalid input'Input contained an error. |
You can try my smartsearch, though it is a paid app. |
@greyovo just go ahead, I am willing to help if needed. |
Hello, when indexing an album, the app crashes every time around 800 or 900 images have been scanned: java.io.FileNotFoundException: Failed to create image decoder with message 'invalid input'Input contained an error. |
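Amazing! For a first release, it is already remarkably polished!
Thank you! |
Great job overall! |
@Baiyssy Yes, I forgot to consider this. The current version cannot update the index automatically, and there is no way to rebuild the index either... I will fix this in a later release. |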
PicQuery is open-source now, see https://github.com/greyovo/PicQuery :) |
@greyovo Great! I've added your Android repository link in the README. |
Are there any plans to support Android devices in the future?