queue-proxy is huge #9957
Goal: eliminate queue-proxy's dependency on …. Two offenders:
Pulling in the above PR and commenting out the kubeClient references in …
Commenting out the stackdriver logic in these packages further reduces the queue-proxy binary size to …
In my build, opencensus also seems to pull in …
FWIW, that's likely included in my figures, unless it is transitively pulled in through other things. I'm measuring the overall binary size once a particular cut point in the dependency graph has been snipped.
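If it helps to reproduce these numbers, here's a rough sketch of that snip-and-measure approach (output paths are illustrative, and it assumes a knative/serving checkout with queue-proxy's main package under cmd/queue):

```console
# Baseline build of queue-proxy.
go build -o /tmp/queue-before ./cmd/queue

# Snip a cut point: comment out the offending import (and whatever
# references it), then rebuild and compare sizes.
go build -o /tmp/queue-after ./cmd/queue
ls -l /tmp/queue-before /tmp/queue-after
```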
Shouldn't affect the code pages, but @tcnghia also found that if we use … it drops things another ~10MB (on top of what I measured above). Seems like the lowest-hanging fruit yet. https://blog.filippo.io/shrink-your-go-binaries-with-this-one-weird-trick/
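The linked post's trick is stripping debug information at link time; presumably the elided flags were along these lines (a sketch, not necessarily the comment's exact invocation):

```console
# -s drops the symbol table and -w drops DWARF debug info;
# neither changes the executable code pages themselves.
go build -ldflags="-s -w" -o /tmp/queue-stripped ./cmd/queue
```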
@evankanderson do you want to link to or summarize your thoughts on the timeline for cutting this dependency?
This issue is stale because it has been open for 90 days with no activity.
/remove-lifecycle stale
This reminded me of reading https://www.cockroachlabs.com/blog/go-file-size/ I don't know how up to date it is, but maybe it would be reasonable to consider another language if we wanted to minimize the binary? For the stackdriver dependency, we're in the (somewhat slow) process of rolling out a queue-proxy that uses OTel to our fleet. I believe we also need to publish some docs on how to configure OTel in place of the old stackdriver support; I can try to get that done soon.
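For context, queue-proxy's metrics backend is selected through the config-observability ConfigMap. A hedged sketch of pointing it at an OpenCensus/OTel collector instead of stackdriver, assuming the metrics.backend-destination and metrics.opencensus-address keys from knative/pkg's metrics config (the collector address and namespace are illustrative):

```console
# Switch the metrics backend to the OpenCensus protocol and point it
# at an OTel collector that exposes an OpenCensus receiver.
kubectl -n knative-serving patch configmap config-observability --type merge \
  -p '{"data":{"metrics.backend-destination":"opencensus","metrics.opencensus-address":"otel-collector.metrics:55678"}}'
```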
/lifecycle frozen
Yeah, I poked through that when I first opened this.
While I think minimizing the binary is a nice goal, I also think there are practical considerations. I'm reminded of Kubernetes debating rewriting the …. Likewise here, I don't think the goal is to have an O(KB) QP, but it'd be nice if we could keep it away from O(100MB) 😉
I already added …
Gotcha 😃 That all makes sense to me!
Totally agree; after I wrote that I immediately thought: well, what language, then? 🚲 Maaaybe something like Rust, but that's a stretch for maintenance, though it does sound kind of fun as a proof of concept.
Will try to keep this updated with our progress; we're close enough that I wouldn't mind seeing the stackdriver deletion in 0.21 or 0.22, wdyt? At this point I think it's fair for *us* to take on the pain of porting that patch forward if we absolutely must 🤕
Have you met @julz? 😉
Defer to @evankanderson, who's been tracking the broader effort most closely.
I mean, envoy is written in C++ and has survived OK, so rust may not be so bad (and it has worked well for linkerd)! I started a repo at https://github.com/julz/roo-proxy a while ago to experiment a bit (without much progress yet, tho!). One thing I'd really like to try is to see how much of QP we could reimplement as a wasm plugin for envoy; that way we could reuse all of the great work and optimisations and features in envoy and (for people who already have a mesh) avoid needing a second sidecar at all. Anyway, totally ack that there's a maintainability trade-off here, just that this may be a case where it's warranted (empirically QP is very expensive for what it does), and rust is pretty ok.
FWIW, just to throw my hat in here too (not the red one necessarily): I've been tinkering on a rust-based queue-proxy in my friday learning time as well, without a ton of progress either. Currently it serves mostly as a segue for me to actually learn the language, but the queue-proxy in itself is a rather complex piece of software by now, I must say 😂. It ain't trivially rewritten in a day or two. It'd definitely be interesting to see the diff in resource consumption and performance. Maintainability trade-off... definitely!
/assign
/triage accepted
It sounds like Google may be looking to sustain Stackdriver support for longer than expected... I'm going to try to switch …
We've dropped the stackdriver exporters here: knative/pkg#2173. This has dropped the queue-proxy size (on my Mac) from 50MB to 30MB - so a diff of 20MB (~40%). There might be some additional gains to be made, and I'll poke at this a bit more.
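For anyone hunting the remaining gains, go mod why can show which import chain still pulls a heavy module into the build; for example (using the stackdriver exporter module that was just dropped):

```console
# Prints the shortest import chain from the main module to the
# dependency, or "(main module does not need module ...)" once gone.
go mod why -m contrib.go.opencensus.io/exporter/stackdriver
```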
So there are still gains to be made by dropping …. Specifically, the list is:
I tested changes 1-3 and removed tracing and metrics in order to see what the floor of completing 4-6 would be; that resulted in a queue-proxy binary size of 15MB. There's probably a gain to be had when we remove its copy of the prometheus libs, i.e. #11126.
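To double-check what actually ends up embedded in a built binary, Go can dump its module list; a quick sketch (the binary path is illustrative):

```console
# Lists the module versions compiled into the binary; any lingering
# prometheus copies will show up in the output.
go version -m /tmp/queue | grep -i prometheus
```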
/unassign @mattmoor
@julz |
/area API
/area autoscale
/area networking
What version of Knative?
HEAD
Expected Behavior
queue-proxy is trivially small.
Actual Behavior
As of this morning, queue-proxy is ~55.7MB.
Steps to Reproduce the Problem
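A minimal reproduction sketch, assuming a knative/serving checkout (queue-proxy's main package is assumed to live under cmd/queue):

```console
go build -o /tmp/queue ./cmd/queue
ls -lh /tmp/queue   # ~55.7MB as of this writing
```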
Going to self-assign, but feel free to jump in and party on this with me.
/assign