Open
Labels
bug (Something isn't working), stale (Over 90 days of inactivity)
Description
Your current environment
The output of python collect_env.py
==============================
System Info
==============================
OS : Ubuntu 22.04.5 LTS (x86_64)
GCC version : (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version : Could not collect
CMake version : version 3.26.4
Libc version : glibc-2.35
==============================
PyTorch Info
==============================
PyTorch version : 2.7.0+git6fd4078
Is debug build : False
CUDA used to build PyTorch : N/A
ROCM used to build PyTorch : 6.4.43483-a187df25c
==============================
Python Environment
==============================
Python version : 3.10.12 (main, Feb 4 2025, 14:57:36) [GCC 11.4.0] (64-bit runtime)
Python platform : Linux-5.15.0-116-generic-x86_64-with-glibc2.35
==============================
CUDA / GPU Info
==============================
Is CUDA available : True
CUDA runtime version : Could not collect
CUDA_MODULE_LOADING set to : LAZY
GPU models and configuration : AMD Instinct MI300X (gfx942:sramecc+:xnack-)
Nvidia driver version : Could not collect
cuDNN version : Could not collect
HIP runtime version : 6.4.43483
MIOpen runtime version : 3.4.0
Is XNNPACK available : True
==============================
Versions of relevant libraries
==============================
[pip3] conch-triton-kernels==1.2.1
[pip3] numpy==2.2.6
[pip3] pyzmq==26.4.0
[pip3] torch==2.7.0+git6fd4078
[pip3] torchao==0.11.0
[pip3] torchaudio==2.7.0a0+654fee8
[pip3] torchvision==0.22.0+9eb57cd
[pip3] transformers==4.52.4
[pip3] triton==3.3.0
[conda] Could not collect
==============================
vLLM Info
==============================
ROCM Version : 6.4.43483-a187df25c
Neuron SDK Version : N/A
vLLM Version : 0.1.dev7675+gf184e89 (git sha: f184e89)
==============================
Environment Variables
==============================
TORCHINDUCTOR_MAX_AUTOTUNE_POINTWISE=1
NCCL_MIN_NCHANNELS=112
TORCHINDUCTOR_MAX_AUTOTUNE=1
PYTORCH_ROCM_ARCH=gfx942
TORCH_BLAS_PREFER_HIPBLASLT=1
LD_LIBRARY_PATH=/opt/rocm-6.4.1/lib:/usr/local/lib:
VLLM_USE_TRITON_FLASH_ATTN=0
NCCL_CUMEM_ENABLE=0
PYTORCH_NVML_BASED_CUDA_CHECK=1
TORCHINDUCTOR_COMPILE_THREADS=1
CUDA_MODULE_LOADING=LAZY
🐛 Describe the bug
Running pytest -svvvv tests/kernels/mamba/test_mamba_ssm_ssd.py::test_mamba_chunk_scan_cont_batch aborts with a core dump.
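One way to narrow down which kernel launch faults is to rerun the same test with GPU launches serialized. The sketch below only builds the command and environment. HIP_LAUNCH_BLOCKING and AMD_SERIALIZE_KERNEL are ROCm debugging variables that were not set in the reported environment. Using them here is an assumption, and whether they change the failure on this setup is untested. The helper name build_debug_run is hypothetical.

```python
import os
import sys

# Test id taken from the report.
TEST_ID = "tests/kernels/mamba/test_mamba_ssm_ssd.py::test_mamba_chunk_scan_cont_batch"

# Assumed debugging settings; none of these appear in the original report.
DEBUG_ENV = {
    "HIP_LAUNCH_BLOCKING": "1",   # make kernel launches synchronous
    "AMD_SERIALIZE_KERNEL": "3",  # wait before and after each kernel launch
}


def build_debug_run(test_id=TEST_ID, base_env=None):
    """Return (cmd, env) for rerunning one test with serialized launches."""
    # Start from the caller's environment so the existing settings are kept.
    env = dict(os.environ if base_env is None else base_env)
    env.update(DEBUG_ENV)
    cmd = [sys.executable, "-m", "pytest", "-svvvv", test_id]
    return cmd, env
```

The returned pair can be passed to subprocess.run(cmd, env=env) from the vLLM checkout. With launches serialized, the abort should surface closer to the launch that caused it rather than at a later synchronize call.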
Fail:
:0:rocdevice.cpp :2991: 1164852989842 us: Callback: Queue 0x7f56c1800000 aborting with error : HSA_STATUS_ERROR_MEMORY_APERTURE_VIOLATION: The agent attempted to access memory beyond the largest legal address. code: 0x29
Fatal Python error: Aborted
Thread 0x00007f60fbd081c0 (most recent call first):
File "/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py", line 1040 in synchronize
File "/usr/local/lib/python3.10/dist-packages/triton/testing.py", line 146 in do_bench
File "/usr/local/lib/python3.10/dist-packages/triton/runtime/autotuner.py", line 170 in _bench
File "/usr/local/lib/python3.10/dist-packages/triton/runtime/autotuner.py", line 192 in <dictcomp>
File "/usr/local/lib/python3.10/dist-packages/triton/runtime/autotuner.py", line 192 in run
File "/usr/local/lib/python3.10/dist-packages/triton/runtime/jit.py", line 348 in <lambda>
File "/app/upstreamupgradeaiter/mtp-v1/vllm/model_executor/layers/mamba/ops/ssd_chunk_state.py", line 712 in chunk_state_varlen
File "/app/upstreamupgradeaiter/mtp-v1/vllm/model_executor/layers/mamba/ops/ssd_combined.py", line 155 in _mamba_chunk_scan_combined_fwd
File "/app/upstreamupgradeaiter/mtp-v1/vllm/model_executor/layers/mamba/ops/ssd_combined.py", line 208 in mamba_chunk_scan_combined
File "/app/upstreamupgradeaiter/mtp-v1/tests/kernels/mamba/test_mamba_ssm_ssd.py", line 281 in test_mamba_chunk_scan_cont_batch
File "/usr/local/lib/python3.10/dist-packages/_pytest/python.py", line 156 in pytest_pyfunc_call
File "/usr/local/lib/python3.10/dist-packages/pluggy/_callers.py", line 121 in _multicall
File "/usr/local/lib/python3.10/dist-packages/pluggy/_manager.py", line 120 in _hookexec
File "/usr/local/lib/python3.10/dist-packages/pluggy/_hooks.py", line 512 in __call__
File "/usr/local/lib/python3.10/dist-packages/_pytest/python.py", line 1670 in runtest
File "/usr/local/lib/python3.10/dist-packages/_pytest/runner.py", line 178 in pytest_runtest_call
File "/usr/local/lib/python3.10/dist-packages/pluggy/_callers.py", line 121 in _multicall
File "/usr/local/lib/python3.10/dist-packages/pluggy/_manager.py", line 120 in _hookexec
File "/usr/local/lib/python3.10/dist-packages/pluggy/_hooks.py", line 512 in __call__
File "/usr/local/lib/python3.10/dist-packages/_pytest/runner.py", line 246 in <lambda>
File "/usr/local/lib/python3.10/dist-packages/_pytest/runner.py", line 344 in from_call
File "/usr/local/lib/python3.10/dist-packages/_pytest/runner.py", line 245 in call_and_report
File "/usr/local/lib/python3.10/dist-packages/_pytest/runner.py", line 136 in runtestprotocol
File "/usr/local/lib/python3.10/dist-packages/_pytest/runner.py", line 117 in pytest_runtest_protocol
File "/usr/local/lib/python3.10/dist-packages/pluggy/_callers.py", line 121 in _multicall
File "/usr/local/lib/python3.10/dist-packages/pluggy/_manager.py", line 120 in _hookexec
File "/usr/local/lib/python3.10/dist-packages/pluggy/_hooks.py", line 512 in __call__
File "/usr/local/lib/python3.10/dist-packages/_pytest/main.py", line 367 in pytest_runtestloop
File "/usr/local/lib/python3.10/dist-packages/pluggy/_callers.py", line 121 in _multicall
File "/usr/local/lib/python3.10/dist-packages/pluggy/_manager.py", line 120 in _hookexec
File "/usr/local/lib/python3.10/dist-packages/pluggy/_hooks.py", line 512 in __call__
File "/usr/local/lib/python3.10/dist-packages/_pytest/main.py", line 343 in _main
File "/usr/local/lib/python3.10/dist-packages/_pytest/main.py", line 289 in wrap_session
File "/usr/local/lib/python3.10/dist-packages/_pytest/main.py", line 336 in pytest_cmdline_main
File "/usr/local/lib/python3.10/dist-packages/pluggy/_callers.py", line 121 in _multicall
File "/usr/local/lib/python3.10/dist-packages/pluggy/_manager.py", line 120 in _hookexec
File "/usr/local/lib/python3.10/dist-packages/pluggy/_hooks.py", line 512 in __call__
File "/usr/local/lib/python3.10/dist-packages/_pytest/config/__init__.py", line 175 in main
File "/usr/local/lib/python3.10/dist-packages/_pytest/config/__init__.py", line 201 in console_main
File "/usr/local/bin/pytest", line 33 in <module>
Extension modules: numpy._core._multiarray_umath, numpy.linalg._umath_linalg, torch._C, torch._C._dynamo.autograd_compiler, torch._C._dynamo.eval_frame, torch._C._dynamo.guards, torch._C._dynamo.utils, torch._C._fft, torch._C._linalg, torch._C._nested, torch._C._nn, torch._C._sparse, torch._C._special, zstandard.backend_c, charset_normalizer.md, yaml._yaml, PIL._imaging, regex._regex, markupsafe._speedups, sklearn.__check_build._check_build, scipy._lib._ccallback_c, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, scipy.sparse._sparsetools, _csparsetools, scipy.sparse._csparsetools, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg.cython_lapack, scipy.linalg._cythonized_array_utils, scipy.linalg._solve_toeplitz, scipy.linalg._decomp_lu_cython, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg._matfuncs_expm, scipy.linalg._linalg_pythran, scipy.linalg.cython_blas, scipy.linalg._decomp_update, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.linalg._propack._spropack, scipy.sparse.linalg._propack._dpropack, scipy.sparse.linalg._propack._cpropack, scipy.sparse.linalg._propack._zpropack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, psutil._psutil_linux, psutil._psutil_posix, scipy.special._ufuncs_cxx, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.special._ellip_harm_2, scipy.spatial._ckdtree, scipy._lib.messagestream, scipy.spatial._qhull, scipy.spatial._voronoi, scipy.spatial._distance_wrap, scipy.spatial._hausdorff, scipy.spatial.transform._rotation, scipy.optimize._group_columns, scipy.optimize._trlib._trlib, scipy.optimize._lbfgsb, _moduleTNC, scipy.optimize._moduleTNC, scipy.optimize._cobyla, scipy.optimize._slsqp, scipy.optimize._minpack, scipy.optimize._lsq.givens_elimination, scipy.optimize._zeros, scipy.optimize._cython_nnls, scipy._lib._uarray._uarray, scipy.linalg._decomp_interpolative, scipy.optimize._bglu_dense, scipy.optimize._lsap, scipy.optimize._direct, scipy.integrate._odepack, scipy.integrate._quadpack, scipy.integrate._vode, scipy.integrate._dop, scipy.integrate._lsoda, scipy.interpolate._fitpack, scipy.interpolate._dfitpack, scipy.interpolate._dierckx, scipy.interpolate._ppoly, scipy.interpolate._interpnd, scipy.interpolate._rbfinterp_pythran, scipy.interpolate._rgi_cython, scipy.interpolate._bspl, scipy.special.cython_special, scipy.stats._stats, scipy.stats._sobol, scipy.stats._qmc_cy, scipy.stats._biasedurn, scipy.stats._stats_pythran, scipy.stats._levy_stable.levyst, scipy.stats._ansari_swilk_statistics, scipy.stats._mvn, scipy.stats._rcont.rcont, scipy.ndimage._nd_image, scipy.ndimage._rank_filter_1d, _ni_label, scipy.ndimage._ni_label, pyarrow.lib, pandas._libs.tslibs.ccalendar, pandas._libs.tslibs.np_datetime, pandas._libs.tslibs.dtypes, pandas._libs.tslibs.base, pandas._libs.tslibs.nattype, pandas._libs.tslibs.timezones, pandas._libs.tslibs.fields, pandas._libs.tslibs.timedeltas, pandas._libs.tslibs.tzconversion, pandas._libs.tslibs.timestamps, pandas._libs.properties, pandas._libs.tslibs.offsets, pandas._libs.tslibs.strptime, pandas._libs.tslibs.parsing, pandas._libs.tslibs.conversion, pandas._libs.tslibs.period, pandas._libs.tslibs.vectorized, pandas._libs.ops_dispatch, pandas._libs.missing, pandas._libs.hashtable, pandas._libs.algos, pandas._libs.interval, pandas._libs.lib, pyarrow._compute, pandas._libs.ops, numexpr.interpreter, pandas._libs.hashing, pandas._libs.arrays, pandas._libs.tslib, pandas._libs.sparse, pandas._libs.internals, pandas._libs.indexing, pandas._libs.index, pandas._libs.writers, pandas._libs.join, pandas._libs.window.aggregations, pandas._libs.window.indexers, pandas._libs.reshape, pandas._libs.groupby, pandas._libs.json, pandas._libs.parsers, pandas._libs.testing, sklearn.utils._isfinite, sklearn.utils.sparsefuncs_fast, sklearn.utils.murmurhash, sklearn.utils._openmp_helpers, sklearn.metrics.cluster._expected_mutual_info_fast, sklearn.preprocessing._csr_polynomial_expansion, sklearn.preprocessing._target_encoder_fast, sklearn.metrics._dist_metrics, sklearn.metrics._pairwise_distances_reduction._datasets_pair, sklearn.utils._cython_blas, sklearn.metrics._pairwise_distances_reduction._base, sklearn.metrics._pairwise_distances_reduction._middle_term_computer, sklearn.utils._heap, sklearn.utils._sorting, sklearn.metrics._pairwise_distances_reduction._argkmin, sklearn.metrics._pairwise_distances_reduction._argkmin_classmode, sklearn.utils._vector_sentinel, sklearn.metrics._pairwise_distances_reduction._radius_neighbors, sklearn.metrics._pairwise_distances_reduction._radius_neighbors_classmode, sklearn.metrics._pairwise_fast, zmq.backend.cython._zmq, PIL._imagingft, hiredis.hiredis, msgspec._core, pybase64._pybase64, multidict._multidict, yarl._quoting_c, propcache._helpers_c, aiohttp._http_writer, aiohttp._http_parser, aiohttp._websocket.mask, aiohttp._websocket.reader_c, frozenlist._frozenlist, hip_utils, __triton_launcher (total: 192)
Aborted
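Most frames in the dump belong to pytest and pluggy. As a reading aid, the sketch below filters a faulthandler-style trace down to the frames outside installed packages, which here are the vLLM and test files. The helper name project_frames and the skip markers are assumptions chosen for this report's paths. The sample text is copied from the trace above.

```python
import re

# Matches frames as printed in the dump: File "<path>", line <n> in <func>
FRAME_RE = re.compile(r'File "([^"]+)", line (\d+) in (\S+)')


def project_frames(trace, skip=("dist-packages", "/usr/local/bin")):
    """Return (file name, line, function) for frames outside installed packages."""
    frames = []
    # findall also handles lines that carry more than one frame.
    for path, line, func in FRAME_RE.findall(trace):
        if not any(marker in path for marker in skip):
            frames.append((path.rsplit("/", 1)[-1], int(line), func))
    return frames


SAMPLE = """
File "/usr/local/lib/python3.10/dist-packages/triton/runtime/jit.py", line 348 in <lambda>
File "/app/upstreamupgradeaiter/mtp-v1/vllm/model_executor/layers/mamba/ops/ssd_chunk_state.py", line 712 in chunk_state_varlen
File "/app/upstreamupgradeaiter/mtp-v1/vllm/model_executor/layers/mamba/ops/ssd_combined.py", line 155 in _mamba_chunk_scan_combined_fwd
File "/app/upstreamupgradeaiter/mtp-v1/vllm/model_executor/layers/mamba/ops/ssd_combined.py", line 208 in mamba_chunk_scan_combined
File "/app/upstreamupgradeaiter/mtp-v1/tests/kernels/mamba/test_mamba_ssm_ssd.py", line 281 in test_mamba_chunk_scan_cont_batch
File "/usr/local/lib/python3.10/dist-packages/_pytest/python.py", line 156 in pytest_pyfunc_call
"""

for name, line, func in project_frames(SAMPLE):
    print(f"{name}:{line} in {func}")
```

Applied to the sample, this leaves four frames, with chunk_state_varlen in ssd_chunk_state.py (line 712) as the last vLLM call before control enters the Triton autotuner.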
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.