
Set up dedicated devices / Tensorflow serving / Use an ExApp as backend #73

Open
arch-user-france1 opened this issue Aug 28, 2021 · 37 comments
Labels
enhancement New feature or request

Comments

@arch-user-france1

Make an app that can be installed on other devices, plus a setting so that a device running that app, once configured (and online), receives the model and the pictures and can process them by itself. My server doesn't have a good GPU, so I would like to run it on my faster computer with an NVIDIA GeForce GTX 1660 Super (or, if a Raspberry Pi is online, send it some data so it is a bit faster, etc.)

@marcelklehr
Member

This is a good idea, but it will take some time until I can get to it.

@marcelklehr added the enhancement label Aug 28, 2021
@arch-user-france1
Author

arch-user-france1 commented Aug 28, 2021

I love this project.

It is very fun: people start clicking on cats (as this is the most accurate category) and see how their cats looked over the past two years.
Once my cloud works well, I may stop paying for OneDrive and pay you $1/month (well, I get 50 CHF/month).

@stavros-k

One advantage is that you don't have to overload a Nextcloud container with extra packages.
It can also scale well: in a Kubernetes cluster, for example, you can spawn multiple workers to do the processing (maybe not so relevant for a home lab, but in bigger installations it would be awesome).

@arch-user-france1
Author

arch-user-france1 commented Sep 20, 2021

Maybe I can do a little beta.

Maybe a bit of bash script that lets the classifier work over SSH (rsync).

@arch-user-france1
Author

arch-user-france1 commented Oct 15, 2021

I'm actually thinking about it, because I don't want to use a bash script.
It sucks.

How do you use node/TensorFlow? I see that you use `-` instead of a file path. Is it a big performance hit to spawn a TensorFlow process for every file?

I should look into the code...

@marcelklehr
Member

marcelklehr commented Oct 15, 2021

The classifier scripts accept input either via CLI args or as JSON via stdin.
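
For illustration, a minimal Node.js sketch of driving such a script over stdin; the script name `classifier_imagenet.js` and the exact JSON shapes are assumptions, not the app's confirmed interface:

```js
// Hypothetical driver: spawn a classifier script and feed it a JSON
// file list on stdin ("-" meaning "read paths from stdin").
const { execFile } = require('child_process');

const child = execFile('node', ['classifier_imagenet.js', '-'], (err, stdout) => {
  if (err) throw err;
  // Assume one JSON result per output line.
  for (const line of stdout.trim().split('\n')) {
    console.log(JSON.parse(line));
  }
});

child.stdin.end(JSON.stringify(['/photos/cat1.jpg', '/photos/cat2.jpg']));
```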

@marcelklehr
Member

Ideally we would have a Tensorflow serving container and allow people to connect to it with recognize.
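
For reference, a minimal sketch of how that could look with the stock `tensorflow/serving` image and its REST predict endpoint; the model name `recognize`, the host path, and the input shape are placeholders:

```sh
# Serve a SavedModel over REST on port 8501
docker run -p 8501:8501 \
  -v /path/to/models/recognize:/models/recognize \
  -e MODEL_NAME=recognize \
  tensorflow/serving

# Query it: POST a batch of instances, get predictions back as JSON
curl -X POST http://localhost:8501/v1/models/recognize:predict \
  -d '{"instances": [[0.0, 0.1, 0.2]]}'
```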

@marcelklehr changed the title from "Set up dedicated devices" to "Set up dedicated devices / Tensorflow serving" Oct 15, 2021

@gotagitaccnt

Any updates on this? I would really appreciate this feature being implemented, or maybe some pointers on how to start.

@guystreeter

I can run the rest of Nextcloud on a 1 GB VM.
A 4 GB VM costs almost 4 times as much per month.
I would like to batch-process recognition on another machine that I only spin up for a day once in a while (or do the recognition runs on my home machine with its NVIDIA GPU).

@ddarek2000

Great idea. Upvote.

@szaimen
Contributor

szaimen commented Jan 21, 2023

I think this would also be a good idea, e.g. for AIO, since libtensorflow does not seem to run in an Alpine container, but could run in another container using Debian as its base.

@guystreeter

Setting up GPU access for a container is complicated. Anyone running Nextcloud in a container would probably want to send image analysis requests to a service running on the native OS.

@marcelklehr
Member

> Setting up GPU access for a container is complicated.

It's not that hard, I believe.

@guystreeter

> > Setting up GPU access for a container is complicated.
>
> It's not that hard, I believe

Have you tried it? The documented steps are expert-admin-level stuff.

@relink2013

I would love for this to become a reality. I'm currently running NC in an Ubuntu VM for the sole purpose of using my Nvidia GPU with Recognize and Memories, and I absolutely hate managing it.

@tbelway

tbelway commented Mar 20, 2023

Oooh, I would love this. A decentralized recognize service would allow for a great deal of flexibility. I use a Nextcloud container (linuxserver) that is Alpine-based, and recognize stopped working relatively recently due to changes in libtensorflow; I previously had it working with some container customization scripts... now that isn't working, which is frustrating.

@marcelklehr I see that you are looking at TensorFlow Serving; does that mean you're thinking of having a Nextcloud recognize-proxy app that would interact with this new instance (probably a container...)?

> > > Setting up GPU access for a container is complicated.
> >
> > It's not that hard, I believe
>
> Have you tried it? The documented steps are expert admin level stuff

I don't know why you'd call it expert-admin-level stuff. It's pretty simple...
I've done this on both Debian- and RHEL-based hypervisors (Proxmox, and KVM with Cockpit), via direct LXC or Podman containerization as well as through hardware passthrough to a VM which is then containerized. As long as you are running Linux, it's trivial (see the sketch below).
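
For anyone who wants to try, a minimal sketch of enabling GPU access for Docker containers on Linux with the NVIDIA Container Toolkit (commands per NVIDIA's docs; repository setup and distro specifics may vary):

```sh
# Install the NVIDIA Container Toolkit (Debian/Ubuntu; assumes the
# NVIDIA driver is already installed on the host)
sudo apt-get install -y nvidia-container-toolkit

# Wire the toolkit into the Docker runtime and restart the daemon
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Verify the GPU is visible from inside a container
docker run --rm --gpus all nvidia/cuda:12.2.2-base-ubuntu22.04 nvidia-smi
```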

@Leptopoda
Member

@marcelklehr
I currently have some time to look into this and would love to give it a shot.
What is the blocker? Or how would you imagine this being implemented?

@arch-user-france1
Author

arch-user-france1 commented Mar 24, 2023

> @marcelklehr I currently have some time to look into this and would love giving this a shot. What is the blocker? Or how would you imagine this being implemented?

IMO, a Node.js socket server could run on the dedicated device, and the images would be sent to it tagged with an ID; preferably 128 to 512 at once, because that's the batch size modern graphics cards need to reach 100% utilisation, rather than something like 7% when images are not supplied in time or do not have enough pixels.
The dedicated device would send the results back with the matching IDs, and done.

To set up a socket, socket.io or ws could be used (see the sketch below).

It would also be very handy to have a configuration file. The easiest would be to export the variables from a JavaScript file and import them in the actual program. Better, though, would be CSV files or something like them.

Remember, this is a simple draft. Perhaps someone would like to implement it. There are many solutions to the problem, and it would always be great to get an answer from marcelklehr so we know what he would actually like to have.
Finally, we could overcome the limitation of the Nextcloud app repository that hinders us from adding back GPU support.
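
A minimal sketch of that idea using the `ws` package (`npm install ws`); the message format (an `id` plus base64-encoded image) and the `classify()` stub are assumptions, not a proposed final protocol:

```js
// Hypothetical batching socket server on the dedicated (GPU) device.
const { WebSocketServer } = require('ws');

const wss = new WebSocketServer({ port: 8765 });

wss.on('connection', (socket) => {
  socket.on('message', async (data) => {
    // Expected shape: [{ id: "abc", image: "<base64>" }, ...] (128-512 items)
    const batch = JSON.parse(data);
    const results = await classify(batch.map((item) => item.image));
    // Send each result back tagged with the ID it came in with.
    socket.send(JSON.stringify(
      batch.map((item, i) => ({ id: item.id, labels: results[i] }))
    ));
  });
});

// Placeholder for the actual GPU inference call.
async function classify(images) {
  return images.map(() => []);
}
```

The Nextcloud server side would open one connection, stream batches, and match results back to files by `id`.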

@Leptopoda
Member

Why would a Node.js socket server be needed?
TensorFlow Serving already has a RESTful API that we could interact with directly. Also, the latest version introduces a batch-size option that sounds like what you wanted (see the example below).

I think you mean the device running TF Serving should fetch the jobs from the server, but that's not how TF Serving is meant to be used.
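
For context, server-side batching in TensorFlow Serving is enabled with `--enable_batching` plus a batching parameters file; the values below are illustrative, not tuned recommendations:

```sh
# Write an illustrative batching config (text-format protobuf)
cat > batching.cfg <<'EOF'
max_batch_size { value: 128 }
batch_timeout_micros { value: 5000 }
num_batch_threads { value: 4 }
EOF

# Run TF Serving with server-side batching enabled
docker run -p 8501:8501 \
  -v "$PWD/models/recognize:/models/recognize" \
  -v "$PWD/batching.cfg:/config/batching.cfg" \
  -e MODEL_NAME=recognize tensorflow/serving \
  --enable_batching=true \
  --batching_parameters_file=/config/batching.cfg
```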

@pktiuk

pktiuk commented Mar 24, 2023

@Leptopoda I fully agree with you.

BTW, regarding how to implement it: I think you should wait for feedback from the maintainer of this repo (@marcelklehr), but he is on vacation right now, so it may take some time to get a reply from him.

@arch-user-france1
Author

I didn't mean to use TF Serving, but to build it on my own. But if you want to use it, then yes, that may be better.

@Zk2u

Zk2u commented Dec 5, 2023

Any update on this?

@Tsaukpaetra

(Commenting to add myself to notifications.)
I am also interested in testing anything that can help this proceed. I recently upgraded from NC 23 (phew) and saw this nifty app. I enabled it, and then made a 😓 face when I realized the dinky little Celeron the server was running on would take years to process the current files (let alone any new ones ingested as part of the family archive project). Meanwhile, my gaming PC is sitting idle and I'm left pondering how to bridge the gap to that immense power for the times it is needed on demand. :)

@RudolfAchter

Same situation here. It looks to me like a huge pain to get TensorFlow running in the nextcloud-aio Alpine image. For me there would also be the benefit of using the power of my gaming PC.
Thinking of enterprises, you could offload the GPU load to separate GPU servers or even a GPU server cluster for analyzing stuff.

@marcelklehr
Member

Nextcloud GmbH is planning to move the classifiers in recognize to docker containers as part of the External Apps Ecosystem in the coming months

@szaimen
Contributor

szaimen commented Jan 14, 2024

> Nextcloud GmbH is planning to move the classifiers in recognize to docker containers as part of the External Apps Ecosystem in the coming months

Sounds great! As soon as it is available via the External Apps Ecosystem, it will also automatically be available in AIO after one enables the docker socket proxy in the AIO interface and installs the app from the Nextcloud apps page :)

@marcelklehr
Member

> Nextcloud GmbH is planning to move the classifiers in recognize to docker containers as part of the External Apps Ecosystem in the coming months

Sorry to say, the plans have been scrapped due to lack of engineering time so far. It's still on our list of things that would be nice to have, but it's not scheduled any time soon for now :/

As mentioned in #1061 I'd be open to community contributions on this.

My rough plan would be not to deviate too much from how the models are run right now. Instead of the Classifier class executing node.js directly, there would be an option in the settings to call out to the recognize External App instead, or perhaps the external app could be auto-detected. The external app would do the same thing as the Classifier class: execute node.js and return the JSON-line results, so they can be processed in the original recognize app (see the sketch below). These are the current docs on how App API / External Apps work: cloud-py-api.github.io/app_api/index.html
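
To make the shape of that concrete, a minimal sketch of what the External App side could look like: a small HTTP service that runs the classifier script and streams its JSON-line output back. The endpoint path, port, and request format are all assumptions; the real App API integration (registration, auth) is documented at the link above and omitted here:

```js
// Hypothetical ExApp backend: receives a JSON file list, runs the
// classifier the same way the Classifier class would, returns JSON lines.
const http = require('http');
const { execFile } = require('child_process');

http.createServer((req, res) => {
  if (req.method !== 'POST' || req.url !== '/classify') {
    res.writeHead(404).end();
    return;
  }
  let body = '';
  req.on('data', (chunk) => (body += chunk));
  req.on('end', () => {
    // Pass the file list to the classifier via stdin ("-" convention).
    const child = execFile('node', ['classifier_imagenet.js', '-'],
      (err, stdout) => {
        if (err) {
          res.writeHead(500).end(err.message);
          return;
        }
        res.writeHead(200, { 'Content-Type': 'application/x-ndjson' });
        res.end(stdout); // one JSON result per line
      });
    child.stdin.end(body);
  });
}).listen(23000);
```

The recognize app would then POST its file list to this service instead of spawning node.js locally.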

@marcelklehr
Member

Note that you do have a little influence over what Nextcloud GmbH works on: we don't promise anything, but every release cycle we try to work on enhancements that get a lot of upvotes, so you may express your support for this by giving this issue an upvote.


@marcelklehr changed the title from "Set up dedicated devices / Tensorflow serving" to "Set up dedicated devices / Tensorflow serving / Use an ExApp as backend" Jun 26, 2024
@priyankub

priyankub commented Nov 28, 2024

Why do we not have a Debian/non-Alpine-based AIO container for more capable systems?

I launched one locally for my system, but before I spend more time, I want to check whether there is even an appetite to launch something for users with more capable systems.

@Zoey2936

> Why do we not have a debian/non-alpine image based AIO container for more capable systems?
>
> I launched one locally for my system, but before I spend more time, want to check if there is even an appetite to launch something for users with more capable systems

See nextcloud/all-in-one#3382

@priyankub

Thank you! So there's likely no appetite to move to Debian-based images anytime soon.
As such, expanding recognize to be able to use an external container seems to be the only other viable option.

Did I understand that right?

@priyankub

priyankub commented Dec 9, 2024

What base image would the team prefer for a container that can use a GPU to run the classifiers?

Context: I am currently working on and testing this feature.

@szaimen
Contributor

szaimen commented Dec 9, 2024

> What base image would the team prefer for a container that can use GPU to run the classifiers?

I'd say debian:stable. WDYT @marcelklehr?

@marcelklehr
Member

We've been using nvidia/cuda:12.2.2-cudnn8-devel-ubuntu22.04 for our AI ex apps so far.
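
For anyone prototyping such a container, a minimal Dockerfile sketch on that base image; the Node.js setup via NodeSource, the copied files, and the entrypoint are assumptions for illustration, not the team's actual setup:

```dockerfile
# Hypothetical GPU-capable classifier container (sketch)
FROM nvidia/cuda:12.2.2-cudnn8-devel-ubuntu22.04

# Install Node.js (NodeSource; version choice is an assumption)
RUN apt-get update && apt-get install -y curl ca-certificates \
 && curl -fsSL https://deb.nodesource.com/setup_20.x | bash - \
 && apt-get install -y nodejs \
 && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY . .
RUN npm ci

# Placeholder entrypoint: e.g. the ExApp service sketched earlier
CMD ["node", "server.js"]
```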
