[DRAFT] Feature/merge changes from upstream #3
base: master
Conversation
Batch GPU decoding
I'm still running some tests with the new package and a running engine, but so far the running time of concurrent requests is pretty stable. Submitting 4 requests at the same time, over and over again, leads to total request times averaging about 45 seconds. The important detail here is that I am not seeing any request times roughly double that, as I was with the old package. I also checked whether identical requests still produce different outputs (the non-determinism issue). It appears that has not been resolved, but I wasn't really expecting these changes to resolve it anyway.
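For concreteness, here is a minimal sketch of the kind of timing check described above. The endpoint URL and payload are placeholder assumptions, not the actual engine API:

```python
import time
from concurrent.futures import ThreadPoolExecutor

import requests

# Placeholder endpoint and payload; the real engine's API differs.
URL = "http://localhost:8080/score"
PAYLOAD = {"text": "example input"}

def timed_request(i):
    """Submit one request; return (index, elapsed seconds, response body)."""
    start = time.perf_counter()
    response = requests.post(URL, json=PAYLOAD, timeout=120)
    return i, time.perf_counter() - start, response.text

# Submit 4 identical requests at the same time and report per-request latency.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(timed_request, range(4)))

for i, elapsed, body in results:
    print(f"request {i}: {elapsed:.1f}s")

# Non-determinism check: identical inputs should yield identical outputs.
outputs = {body for _, _, body in results}
print("deterministic" if len(outputs) == 1 else f"{len(outputs)} distinct outputs")
```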
Thanks for putting this together @mulhod! Looking through the changes here, the major one seems to be batch decoding that also supports the GPU, which is cool! However, I couldn't find any changes directly related to thread safety. Can you point me to where those are?
@desilinguist Actually, I can't find them either. I was assuming that certain changes had already been merged upstream; maybe they haven't been. My first thought was simply to merge in everything since this branched off.

In any case, I am now seeing an unexpected pattern that I believe was making the issue look resolved when it actually wasn't. Using the original package, I first submitted 1 request at a time just to make sure things were working; those requests took a stable amount of time, about 30 seconds each. When I then submitted 3 requests at a time, certain requests took considerably longer than others, ranging from 30 seconds to a minute, which looked like the same pattern we saw in the engine context. When I built the new package with the changes in this PR and tried 4 simultaneous requests, they all took about 55 seconds, so I thought the issue was resolved because the timing was stable. However, I then went back and reran the same test with the original package and saw similarly stable times. Long story short, I don't believe this PR is resolving anything, unfortunately.

I will take a look at the PR that I thought had been merged relating to the thread safety issues. In the meantime, though, it could still be a good idea to update the package. It wouldn't do any harm, at least...
Thanks for the detailed update and for confirming my intuition, @mulhod. It definitely makes sense to update the package with the latest upstream code! From what I remember, the thread-safety changes might have been in a fork ... maybe we can incorporate them into our branch too?
Yeah, I'm going to look into that shortly. I'll just leave this open as a draft for now and work on it when I get a chance.
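The fork's actual thread-safety changes aren't visible in this thread, so purely as an illustration of what such changes often look like: when native bindings are not safe to call from multiple threads, two common mitigations are serializing calls behind a lock or giving each worker thread its own instance. A minimal sketch, with `load_model` and `decode` as hypothetical stand-ins for the real bindings:

```python
import threading

# Hypothetical stand-ins for the real bindings; names are assumptions.
from some_bindings import load_model  # model.decode(text) -> str

# Pattern 1: one shared instance, serialized behind a lock.
_model = load_model("model.bin")
_model_lock = threading.Lock()

def decode_locked(text):
    """Safe but serialized: concurrent callers queue on the lock."""
    with _model_lock:
        return _model.decode(text)

# Pattern 2: one instance per thread, so calls never overlap on one object.
_local = threading.local()

def decode_thread_local(text):
    """Concurrent, at the cost of one model instance per worker thread."""
    if not hasattr(_local, "model"):
        _local.model = load_model("model.bin")
    return _local.model.decode(text)
```

The lock keeps a shared instance correct but forces callers to queue, which would itself produce the "some requests take double the time" pattern; per-thread instances trade memory for genuine concurrency.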
In this branch (which is a branch from the current `master` branch in this fork), I merged changes from the upstream `master` branch and also incremented the version of the Python bindings to `0.3.42`. Check out the 3 commits at the very bottom to see the extra changes I put in. The merged-in changes help with the thread safety issue, and I am able to get similar times for each request.

I will make this a draft PR for now until I can do some more testing with a running Sparkle engine. Thus far, I have only been testing this outside the engine using a custom-made script that runs the `calculate_feature` function concurrently (I can share this code if it would be useful; a sketch appears below). With this script, I was able to reproduce the pattern we were seeing in the engine (some requests taking double the time). With my updates, concurrent executions take roughly the same time.
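The script itself isn't included in this PR; the following is a minimal sketch of what such a driver could look like, assuming `calculate_feature` takes a single text argument and is importable as shown (the real import path and signature may differ):

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

# Assumed import path and signature; adjust for the real bindings.
from package import calculate_feature

INPUTS = ["sample response"] * 4  # same input, 4 concurrent calls

def timed_call(text):
    """Run one feature calculation and return its wall-clock time."""
    start = time.perf_counter()
    calculate_feature(text)
    return time.perf_counter() - start

with ThreadPoolExecutor(max_workers=len(INPUTS)) as pool:
    futures = [pool.submit(timed_call, text) for text in INPUTS]
    for future in as_completed(futures):
        print(f"{future.result():.1f}s")

# Before the upstream merge, some of these calls took roughly double the
# time of the others; after the merge they finish in about the same time.
```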