SDXL sample app is broken on latest nightly #753
Comments
My first guess is an IREE runtime regression. We can probably avoid this for now by turning off async allocations, but ideally we fix or revert IREE before release, since turning them off will impact performance.
Perhaps a separate issue -- I notice the SDXL test has been failing as shown:
This furthers my suspicion that there has been an IREE runtime regression. We had some significant changes land in IREE main that may need a bit of attention to smooth out downstream wrinkles.
The double free should be fixed by iree-org/iree#19583
If you run with
I do encounter a segmentation fault, but only when the workers are under load and async allocations are in use. If I switch on the caching allocator (or switch off async allocations), the segfault does not occur. This is what is printed out at the segfault with amd_log_level=1:
Reopening as this should not have been closed.
I think we can call this fixed? Anything left to follow up on? Maybe turning off the caching allocator, @monorimet?
Ran through the user guide as part of kicking off the release. The server app starts up normally but throws an error on any client request:
[2025-01-04 00:59:21.078] [error] [service.py:384] Fatal error in image generation
Traceback (most recent call last):
File "/home/sosa/3.11.venv/lib/python3.11/site-packages/shortfin_apps/sd/components/service.py", line 373, in run
await self._decode(device=device0, requests=self.exec_requests)
File "/home/sosa/3.11.venv/lib/python3.11/site-packages/shortfin_apps/sd/components/service.py", line 657, in _decode
(image,) = await fn(latents, fiber=self.fiber)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: ValueError: shortfin_iree-src/runtime/src/iree/hal/drivers/hip/event_semaphore.c:359: ABORTED; while calling import; while invoking native function hal.device.queue.dealloca;
[ 0] bytecode compiled_vae.decode$async:484 genfiles/sdxl/stable_diffusion_xl_base_1_0_vae_bs1_1024x1024_fp16.mlir:142:3
[2025-01-04 00:59:21.079] [info] [metrics.py:51] Completed inference process (batch size 1) in 1058ms
[2025-01-04 00:59:21] 127.0.0.1:39728 - "POST /generate HTTP/1.1" 200
I've tried different device ids to no avail, as I thought I might be contending with other processes on the same machine. The Llama sample app works fine on the same machine.
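For anyone triaging similar logs, here is a minimal sketch that pulls the source location, status code, and failing native function out of the RuntimeError line above. It is not part of shortfin or IREE: the `parse_iree_status` helper and its regular expression are hypothetical, and the pattern assumes the message keeps the `<file>.c:<line>: <STATUS>; ... native function <name>` shape seen in this one traceback.

```python
import re

# Hypothetical pattern, derived only from the single error line in this issue:
# "<source file>.c:<line>: <STATUS>; ... while invoking native function <name>;"
STATUS_RE = re.compile(
    r"(?P<file>[\w./-]+\.c):(?P<line>\d+): (?P<status>[A-Z_]+);"
    r".*?invoking native function (?P<function>[\w.]+)"
)


def parse_iree_status(message: str):
    """Return file, line, status and native function, or None if no match."""
    match = STATUS_RE.search(message)
    if match is None:
        return None
    info = match.groupdict()
    info["line"] = int(info["line"])
    return info


# The error line copied from the traceback in this report.
LOG_LINE = (
    "RuntimeError: ValueError: shortfin_iree-src/runtime/src/iree/hal/drivers/hip/"
    "event_semaphore.c:359: ABORTED; while calling import; "
    "while invoking native function hal.device.queue.dealloca;"
)

if __name__ == "__main__":
    print(parse_iree_status(LOG_LINE))
```

Applied to this report, the parsed fields point at `hal.device.queue.dealloca` in the HIP driver's `event_semaphore.c`, which is consistent with the async deallocation path discussed in the comments.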