Insights: ray-project/ray
Overview
123 Pull requests merged by 39 people
-
[Data] Always pin the seed when doing file-based random shuffle
#50924 merged
Mar 11, 2025 -
Run workflow tests on postmerge only
#51247 merged
Mar 11, 2025 -
Skip flaky workflow tests
#51245 merged
Mar 11, 2025 -
[core] Remove "serve" and "tune" dependencies from `test_runtime_env_complicated`
#51246 merged
Mar 11, 2025 -
clean up shutdown behavior of serve
#51009 merged
Mar 11, 2025 -
[core] Use current node id if no node id specified for ray drain-node
#51134 merged
Mar 11, 2025 -
[core] Skip `setproctitle` tests on non-Linux machines
#51229 merged
Mar 11, 2025 -
[core] Fix client_connection test windows by trying old port
#51232 merged
Mar 11, 2025 -
[train v2][doc] Update user guides for metrics, checkpoints, results, and experiment tracking
#51204 merged
Mar 11, 2025 -
[core] [easy] [noop] Avoid duplicate function
#51209 merged
Mar 11, 2025 -
Sync (most) CI linters to precommit
#51181 merged
Mar 11, 2025 -
[tune] Fix `RunConfig` deprecation message in Tune being emitted in `trainer.fit` usage
#51198 merged
Mar 11, 2025 -
Avoid `setup-dev.py` writing to `generated/`
#51194 merged
Mar 11, 2025 -
[ray.llm] Refactor download utilities
#51225 merged
Mar 11, 2025 -
[core][refactor] Move `node_stats_to_dict` to `utils.py` to avoid importing unnecessary modules
#51187 merged
Mar 10, 2025 -
[Data] Change default batch size from 1024 to `None`
#50564 merged
Mar 10, 2025 -
[core] Deflake test_object_reconstruction_pending_creation
#51224 merged
Mar 10, 2025 -
[Data] Make chunk combination threshold configurable for improved per…
#51200 merged
Mar 10, 2025 -
[core] Mask tune directory for core tests
#51186 merged
Mar 10, 2025 -
[core] Refactor `test_run_driver_twice` to not use tune
#51220 merged
Mar 10, 2025 -
[core][autoscaler] Add Pod names to the output of `ray status -v`
#51192 merged
Mar 10, 2025 -
[doc][core][cgraph] Add bind() to API page
#51044 merged
Mar 10, 2025 -
[core][cgraph] Clean up CompiledDAG.actor_refs
#51174 merged
Mar 10, 2025 -
[core] Mask rllib directory for core tests
#50618 merged
Mar 10, 2025 -
[core][cgraph][doc] Fix cgraphs gpu docs code again
#51208 merged
Mar 10, 2025 -
[core] Wait for `DisconnectClientReply` in worker shutdown sequence
#51033 merged
Mar 10, 2025 -
[RLlib] Unify namings for actor managers' outstanding in-flight requests metrics.
#51159 merged
Mar 10, 2025 -
[core] Remove AIR util from `test_ray_init_2`
#51215 merged
Mar 10, 2025 -
[core] Move `ray.experimental.tf_utils` back to `rllib`
#51183 merged
Mar 10, 2025 -
[CI] use smaller object_manager deps in targets
#51062 merged
Mar 10, 2025 -
[core] Acquire GIL before calling PyGILState_Release
#51203 merged
Mar 10, 2025 -
[core] Cover cpplint for `/src/ray/object_manager` (excluding `plasma`)
#51089 merged
Mar 10, 2025 -
[core][compiled graphs] Controllably destroy CUDA events in `GPUFuture`s
#51090 merged
Mar 10, 2025 -
[core] Changing default tensor serialization in compiled graphs
#50778 merged
Mar 9, 2025 -
[core] Remove unused AsyncDrainNode
#51043 merged
Mar 9, 2025 -
[core][dashboard] Lazy init `NotifyQueue`
#51178 merged
Mar 9, 2025 -
[train v2] Create a default `ScalingConfig` if one is not provided to the trainer
#51093 merged
Mar 9, 2025 -
Remove tune-specific task events test
#51185 merged
Mar 8, 2025 -
[Data] Add fixed-size compute for read, write, and batch inference release tests
#51176 merged
Mar 8, 2025 -
[telemetry] Move library usage tests out of core
#51161 merged
Mar 8, 2025 -
[serve.llm] Trim probes and reorg the llm.serve release test folder structure
#51112 merged
Mar 8, 2025 -
[core] Use the correct way to check whether an actor task is running or not
#51158 merged
Mar 8, 2025 -
Fix UV hook to support Ray Job submission
#51150 merged
Mar 7, 2025 -
[train][v2] add log file paths to train state
#51085 merged
Mar 7, 2025 -
[train v2][doc] Add updated Train + Tune user guide
#51048 merged
Mar 7, 2025 -
Remove rllib dependency from core tests
#51171 merged
Mar 7, 2025 -
[core] Pass explicit handler for `ClientConnection` errors
#51098 merged
Mar 7, 2025 -
[train v2][doc] Add updated fault tolerance user guide
#51083 merged
Mar 7, 2025 -
[data] fix the new dataset metrics
#51148 merged
Mar 7, 2025 -
[data] Handle schema mismatch in multiple blocks into one parquet file
#50915 merged
Mar 7, 2025 -
[core] Use `psutil.process_iter` to replace `psutil.pids`
#51123 merged
Mar 7, 2025 -
[Core][Doc] Add Document for Python Standard Attributes
#51038 merged
Mar 7, 2025 -
[Data] Make num_blocks in repartition optional
#50997 merged
Mar 7, 2025 -
[serve] Fix bad `mock` import
#51170 merged
Mar 7, 2025 -
[doc][core] fix a wrong url in ray-dag.rst
#49980 merged
Mar 7, 2025 -
[core][cgraph][docs] Fix cgraph gpu docs code
#51132 merged
Mar 7, 2025 -
[core] Block before running gpu tests
#51131 merged
Mar 7, 2025 -
[ray.data.llm] Add release test
#51063 merged
Mar 7, 2025 -
[core] Fix path comparison on windows
#51142 merged
Mar 7, 2025 -
[CI] Enable python-no-log-warn precommit rule
#51099 merged
Mar 7, 2025 -
[Fix][Core] Execute user requested actor exit in C++ side
#49918 merged
Mar 7, 2025 -
[data] fix non-unique operator id
#51108 merged
Mar 7, 2025 -
[core][cgraph] Fix doc test cgraph_overlap
#51139 merged
Mar 7, 2025 -
[data] Add dataset/operator state, progress, total metrics
#50770 merged
Mar 7, 2025 -
[core] Fix double checked locking
#51073 merged
Mar 6, 2025 -
[docs] Hide output of very verbose ipynb notebooks
#51140 merged
Mar 6, 2025 -
[Core][Autoscaler] Refactor v2 Log Formatting
#49350 merged
Mar 6, 2025 -
[serve] Exclude redirects from request error count
#51130 merged
Mar 6, 2025 -
[docs] change order of libraries based on popularity
#51136 merged
Mar 6, 2025 -
[Data] Fix error message for `override_num_blocks` when reading from a HuggingFaceDataset
#50998 merged
Mar 6, 2025 -
use correct mean and standard deviation norm values in image tutorials
#50240 merged
Mar 6, 2025 -
Fix typos in comments and strings
#51079 merged
Mar 6, 2025 -
[ci] Revert UV pip compile for LLM requirements
#51118 merged
Mar 6, 2025 -
[serve.llm] Added benchmark release tests
#51106 merged
Mar 6, 2025 -
Revert "[Doc] RayServe Single-Host TPU v6e Example with vLLM (#47814)"
#51113 merged
Mar 6, 2025 -
Discover TPU logs in Ray Dashboard
#47737 merged
Mar 6, 2025 -
[core] delete UNORDERED_VS_ABSL_MAPS_EVALUATION
#51115 merged
Mar 6, 2025 -
[core] Add comments explaining why `unique_ptr::release()` is used.
#51055 merged
Mar 6, 2025 -
[RLlib; docs] Fix 4 broken html links.
#51015 merged
Mar 6, 2025 -
Revert "[core] Add a warning for non-contiguous numpy array deserialization"
#51111 merged
Mar 6, 2025 -
Various improvements to the Getting Started page
#50878 merged
Mar 6, 2025 -
[docs,tune] fix typo, and standardize equations across the two apis
#51114 merged
Mar 6, 2025 -
[doc] Add documentation for Asynchronous HyperBand Example in Tune
#50708 merged
Mar 6, 2025 -
Improvements to PBT example
#50870 merged
Mar 6, 2025 -
Various enhancements to the Gradio Ray Serve tutorial
#50276 merged
Mar 6, 2025 -
[bugfix] Fixed the issues with compile llm reqs script
#51110 merged
Mar 6, 2025 -
[core][cgraph][docs] Move cgraphs docs code to test
#51000 merged
Mar 6, 2025 -
Various improvements to Tune Pytorch CIFAR tutorial
#50316 merged
Mar 6, 2025 -
[core][cgraph][docs] Compiled Graph troubleshooting
#51030 merged
Mar 5, 2025 -
[train v2] Use the `FailurePolicy` factory
#51067 merged
Mar 5, 2025 -
[core] Implement call once token
#51053 merged
Mar 5, 2025 -
[core] Cover cpplint for `src/ray/object_manager/plasma`
#50954 merged
Mar 5, 2025 -
Improvements to HF Transformers example
#50896 merged
Mar 5, 2025 -
[Doc] RayServe Single-Host TPU v6e Example with vLLM
#47814 merged
Mar 5, 2025 -
[core] Enable `make_shared` in plasma client
#51100 merged
Mar 5, 2025 -
Fixed non-runnable Optuna tutorial
#50404 merged
Mar 5, 2025 -
Various improvements to the Ray Tune XGBoost tutorial
#50455 merged
Mar 5, 2025 -
[doc] Update jemalloc profiling doc
#51031 merged
Mar 5, 2025 -
[core][2/N] Remove `redis_max_memory` from Ray core
#51059 merged
Mar 5, 2025 -
Various enhancements to Tune Keras example:
#50581 merged
Mar 5, 2025 -
minor improvements to hyperopt tutorial
#50697 merged
Mar 5, 2025 -
various improvements to lightgbm tutorial:
#50704 merged
Mar 5, 2025 -
Improvements to Train DeepSpeed example
#50906 merged
Mar 5, 2025 -
[serve] Skip locality backoff unit test on windows
#51101 merged
Mar 5, 2025 -
[Docs][Data] Ordering of rows
#50986 merged
Mar 5, 2025 -
[core] Remove unnecessary `ClientHandler`
#51097 merged
Mar 5, 2025 -
[core] Checkin opentelemetry C++ sdk dependency
#51077 merged
Mar 5, 2025 -
[RLlib] Add 'single_action_space' and 'single_observation_space' to 'VectorMultiAgentEnv`.
#51096 merged
Mar 5, 2025 -
[core] Fix plasma client memleak
#51051 merged
Mar 5, 2025 -
[data] Update loading-data.rst
#41972 merged
Mar 5, 2025 -
[core] `BoundedExecutor` is never used when an actor is async
#51069 merged
Mar 5, 2025 -
[train] Deprecate torch amp wrapper utilities
#51066 merged
Mar 5, 2025 -
[train][v2] implement state export
#50622 merged
Mar 4, 2025 -
[data] Adding in another kwargs
#51068 merged
Mar 4, 2025 -
[LLM APIs] Fast follow up for 2.44 (2/N)
#51064 merged
Mar 4, 2025 -
[train] Remove ray storage dependency and deprecate `RAY_STORAGE` env var configuration option
#50872 merged
Mar 4, 2025 -
[core] Don't build cpp api on pip install
#50499 merged
Mar 4, 2025 -
Fix typos in doc directory
#51054 merged
Mar 4, 2025 -
[Docs] update all hpu related docs
#51028 merged
Mar 4, 2025 -
[core] Add a warning for non-contiguous numpy array deserialization
#50731 merged
Mar 4, 2025 -
[train v2] Run tensorflow release test with v2 enabled
#51046 merged
Mar 4, 2025
63 Pull requests opened by 38 people
-
try remote debug
#51070 opened
Mar 4, 2025 -
[Docs] Update docs to reflect CPU requests/limits change in KubeRay v1.3
#51072 opened
Mar 4, 2025 -
[data] Make Dataset.name/set_name public
#51076 opened
Mar 4, 2025 -
[DONOTMERGE] POC for Ray+torch.distributed
#51078 opened
Mar 4, 2025 -
Fix the grammar of the OOM killer error messages
#51081 opened
Mar 5, 2025 -
[Data] Adding in metrics for number of actors alive, pending and restarting
#51082 opened
Mar 5, 2025 -
[ci] bazel lint all BUILD files - `python/`
#51092 opened
Mar 5, 2025 -
[Doc][KubeRay] Add a doc to explain why some worker Pods are not ready in RayService
#51095 opened
Mar 5, 2025 -
[ray.data.llm] Support S3 paths for model checkpoint and LoRA path
#51103 opened
Mar 5, 2025 -
Replace AMD device env var with HIP_VISIBLE_DEVICES
#51104 opened
Mar 5, 2025 -
[core] Integrate process-wise cgroup setup with task execution
#51107 opened
Mar 6, 2025 -
[DO NOT MERGE] Train v2 bug bash build
#51116 opened
Mar 6, 2025 -
[core][autoscaler][v1] do not removing nodes for upcoming placement groups
#51122 opened
Mar 6, 2025 -
[Core][Bug fix] Trigger local task scheduling after deleting bundle.
#51125 opened
Mar 6, 2025 -
[Data] Store average memory use per task in `OpRuntimeMetrics`
#51126 opened
Mar 6, 2025 -
[Data] Add `memory` attribute to `ExecutionResources`
#51127 opened
Mar 6, 2025 -
[do not merge] Add Daft to the Ray ecosystem page
#51133 opened
Mar 6, 2025 -
[VMware][WCP provider][Part 2/n]: Add vSphere WCP node provider
#51138 opened
Mar 6, 2025 -
[core] Check cgroupv2 mount status
#51141 opened
Mar 6, 2025 -
Reproducing MacOS x86_64 Test Failure w/ Custom Numpy Serializer for ndarrays
#51143 opened
Mar 6, 2025 -
add additional_log_standard_attrs to serve logging config
#51144 opened
Mar 7, 2025 -
Update CODEOWNERS
#51146 opened
Mar 7, 2025 -
[LLM Batch][Telemetry] Add telemetry for batch API
#51147 opened
Mar 7, 2025 -
[core] Implement a universal printer
#51151 opened
Mar 7, 2025 -
[Do Not Merge] Update the Test Script to Debug test_network_failure_e2e Flaky Test
#51153 opened
Mar 7, 2025 -
[Doc] Update vllm example with metrics
#51156 opened
Mar 7, 2025 -
[RLlib] Add timers to env step, forward pass, and complete connector pipelines runs.
#51160 opened
Mar 7, 2025 -
Bump axios from 0.21.4 to 1.8.2 in /python/ray/dashboard/client
#51162 opened
Mar 7, 2025 -
[DONT MERGE] try remote debug
#51175 opened
Mar 7, 2025 -
[train] train v1 export api
#51177 opened
Mar 8, 2025 -
[core] Integrate scoped dup2
#51179 opened
Mar 8, 2025 -
[Data] Support async callable classes in flat_map()
#51180 opened
Mar 8, 2025 -
[Core] Making Object Store Fallback Directory Configurable
#51189 opened
Mar 9, 2025 -
[core] Make testable stream redirection
#51191 opened
Mar 9, 2025 -
[core] Cover cpplint for `ray/tree/master/src/ray/gcs/gcs_server`
#51197 opened
Mar 9, 2025 -
[train v2] Improve `TrainingFailedError` message
#51199 opened
Mar 9, 2025 -
[train v2][doc] Update persistent storage guide
#51202 opened
Mar 9, 2025 -
[doc][core][cgraph] Complete Compiled Graph docs
#51206 opened
Mar 10, 2025 -
[image] Add cuda 12.8.0 in image building matrix
#51210 opened
Mar 10, 2025 -
[doc] Update vllm example support args without values, eg --enable-lora
#51212 opened
Mar 10, 2025 -
Bump jinja2 from 3.1.3 to 3.1.6 in /release
#51216 opened
Mar 10, 2025 -
[RLlib/Tune; docs] Fix broken PB2/RLlib example.
#51219 opened
Mar 10, 2025 -
[Serve.llm] create new telemetry tags for serve.llm
#51221 opened
Mar 10, 2025 -
[train v2][doc] Update API references
#51222 opened
Mar 10, 2025 -
Fix the logic to calculate the number of workers based on the TPU version.
#51227 opened
Mar 10, 2025 -
[Doc] Update documentation for `uv run`
#51233 opened
Mar 11, 2025 -
[docs, tune] replace reuse actors example with a fuller demonstration
#51234 opened
Mar 11, 2025 -
[serve.llm] Add gen-config to oss
#51235 opened
Mar 11, 2025 -
[data] Support `ray_remote_args_fn` in map_groups
#51236 opened
Mar 11, 2025 -
[Data] Abstracting common `shuffle` utility
#51237 opened
Mar 11, 2025 -
[Data] Avoid unnecessary conversion to Numpy when creating Arrow/Pandas blocks
#51238 opened
Mar 11, 2025 -
[DONT MERGE] Hjiang/cgroup on ci status test
#51239 opened
Mar 11, 2025 -
Update precommit hook
#51240 opened
Mar 11, 2025 -
[core] Mask train directory for core tests
#51248 opened
Mar 11, 2025 -
[Test][KubeRay] Add doctest for RayCluster Quickstart doc
#51249 opened
Mar 11, 2025 -
[dashboard] Skip failing subprocess `test_e2e`
#51253 opened
Mar 11, 2025 -
[Doc] update openai example to support LORA and multiple lora modules
#51254 opened
Mar 11, 2025 -
[RLlib] - Remove `self` from `staticmethod` in `Connector` in old API stack.
#51255 opened
Mar 11, 2025 -
Bump keras from 2.15.0 to 3.9.0 in /python
#51256 opened
Mar 11, 2025 -
[DO NOT MERGE] Debug CI error by unpinning pandas version in test_e2e_complex
#51257 opened
Mar 11, 2025 -
[ray.serve.llm] Fix release test
#51258 opened
Mar 11, 2025
42 Issues closed by 14 people
-
CI test linux://rllib:examples/connectors/mean_std_filtering_ppo is flaky
#47435 closed
Mar 11, 2025 -
`serve shutdown` command claims to have shut down the service even if one wasn't running.
#34018 closed
Mar 11, 2025 -
CI test windows://python/ray/tests:test_task_events_2 is consistently_failing
#45966 closed
Mar 11, 2025 -
CI test darwin://python/ray/tests:test_task_events_2 is consistently_failing
#51182 closed
Mar 11, 2025 -
CI test windows://python/ray/tests:test_actor_retry2 is flaky
#47415 closed
Mar 11, 2025 -
CI test linux://rllib:learning_tests_stateless_cartpole_appo_gpu is consistently_failing
#47295 closed
Mar 11, 2025 -
CI test linux://rllib:learning_tests_multi_agent_pendulum_sac_multi_gpu is flaky
#47309 closed
Mar 11, 2025 -
CI test darwin://python/ray/tests:test_state_api is consistently_failing
#46890 closed
Mar 11, 2025 -
CI test linux://rllib:learning_tests_multi_agent_stateless_cartpole_ppo_multi_cpu is flaky
#47313 closed
Mar 11, 2025 -
CI test darwin://python/ray/tests:test_reconstruction_2 is flaky
#50859 closed
Mar 11, 2025 -
[Ray Core] `ray.data.Dataset.repartition` not working consistently with doc/error message
#51060 closed
Mar 10, 2025 -
CI test linux://rllib:learning_tests_cartpole_dqn_multi_cpu is flaky
#47214 closed
Mar 10, 2025 -
CI test windows://python/ray/tests:test_reference_counting_2 is consistently_failing
#45964 closed
Mar 10, 2025 -
[serve] proxy memory leak
#50927 closed
Mar 10, 2025 -
[CI] Replace `:object_manager` with smaller Bazel targets.
#51014 closed
Mar 10, 2025 -
CI test windows://python/ray/tests:test_object_spilling_debug_mode is flaky
#43796 closed
Mar 9, 2025 -
CI test darwin://python/ray/tests:test_advanced_5 is flaky
#44005 closed
Mar 8, 2025 -
CI test linux://rllib:examples/learners/ppo_with_torch_lr_schedulers is flaky
#49181 closed
Mar 8, 2025 -
CI test darwin://python/ray/tests:test_runtime_env_working_dir_3 is consistently_failing
#44765 closed
Mar 8, 2025 -
[core] raylet memory leak
#50955 closed
Mar 8, 2025 -
CI test linux://doc:doc_code_cgraph_nccl is consistently_failing
#51119 closed
Mar 7, 2025 -
CI test linux://doc:doc_code_cgraph_profiling is consistently_failing
#51121 closed
Mar 7, 2025 -
[core][logging] Customizable Python standard log attributes.
#49502 closed
Mar 7, 2025 -
CI test linux://python/ray/serve/tests/unit:test_deployment_state is flaky
#49632 closed
Mar 7, 2025 -
CI test linux://python/ray/serve/tests/unit:test_deployment_state_with_metr_disab is flaky
#49638 closed
Mar 7, 2025 -
CI test linux://python/ray/serve/tests/unit:test_deployment_state_with_compact_scheduling is flaky
#49634 closed
Mar 7, 2025 -
[telemetry] Using only library APIs should not cause "core" usage to be reported
#51169 closed
Mar 7, 2025 -
[Core] Persist Ray runtime resources between sessions
#37377 closed
Mar 7, 2025 -
CI test windows://python/ray/tests:test_logging is consistently_failing
#51135 closed
Mar 7, 2025 -
CI test linux://doc:doc_code_cgraph_overlap is consistently_failing
#51120 closed
Mar 7, 2025 -
[Core] ray.actor.exit_actor() does not seem to work from within an async background thread
#49451 closed
Mar 7, 2025 -
[autoscaler] Refactor ray status output code
#37856 closed
Mar 6, 2025 -
[Docs] Tutorials should use correct normalization values for image datasets
#50239 closed
Mar 6, 2025 -
CI test linux://rllib:learning_tests_multi_agent_pendulum_sac_multi_cpu is flaky
#47264 closed
Mar 6, 2025 -
[core] ray.wait check failed on static_cast(ready.size()) <= num_objects
#51105 closed
Mar 6, 2025 -
[core] Cover cpplint for `src/ray/object_manager/plasma`
#50729 closed
Mar 5, 2025 -
CI test windows://python/ray/serve/tests/unit:test_pow_2_replica_scheduler is flaky
#47950 closed
Mar 5, 2025 -
[Data] Ordering of blocks after map and map_batches
#50890 closed
Mar 5, 2025 -
CI test linux://python/ray/air:test_tensor_extension is flaky
#50819 closed
Mar 5, 2025 -
pip cannot install ray on Alpine Linux
#30416 closed
Mar 5, 2025 -
[Ray Data] Typo in "PyTorch DataLoader arguments" documentation
#51074 closed
Mar 4, 2025
50 Issues opened by 34 people
-
Release test llm_serve_llama_3dot1_8B_lora failed
#51252 opened
Mar 11, 2025 -
Release test llm_serve_llama_3dot1_8B_quantized_tp_1 failed
#51251 opened
Mar 11, 2025 -
Release test llm_serve_llama_3dot1_8B_tp_2 failed
#51250 opened
Mar 11, 2025 -
ModuleNotFoundError: No module named 'ray.serve.llm'
#51244 opened
Mar 11, 2025 -
[Tune] Add examples of using XGBoost with the sklearn API and add examples of regression tasks
#51241 opened
Mar 11, 2025 -
[core] Move to bzlmod for handling dependencies
#51228 opened
Mar 10, 2025 -
CI test windows://src/ray/common/test:client_connection_test is consistently_failing
#51226 opened
Mar 10, 2025 -
[Data] `Dataset.train_test_split` reads dataset twice
#51223 opened
Mar 10, 2025 -
[Core] get_user_temp_dir() Doesn't Honor the User Specified Temp Dir
#51218 opened
Mar 10, 2025 -
[Data] Filter operation changes schema of dataset
#51217 opened
Mar 10, 2025 -
[Core] Ray core tasks tutorial not works. error msg: `Error Type: WORKER_DIED`
#51214 opened
Mar 10, 2025 -
[Serve] how to print log in terminal and save to single custom file path
#51213 opened
Mar 10, 2025 -
[Ray Core] For the same python test, the results of pytest and bazel are inconsistent
#51211 opened
Mar 10, 2025 -
[Data] Adding streaming capability for `ray.data.Dataset.unique`
#51207 opened
Mar 10, 2025 -
[Core] Add c++ API for setting max_task_retries
#51205 opened
Mar 10, 2025 -
[Core] Failed to use uv
#51196 opened
Mar 9, 2025 -
[Core] API Reference: uv
#51195 opened
Mar 9, 2025 -
[Ray Serve]: "RuntimeError: No CUDA GPUs are available" when running vllm with ray
#51193 opened
Mar 9, 2025 -
[Serve] Serve hangs for chained deployments with diamond dependencies data flow
#51190 opened
Mar 9, 2025 -
[core] Cover cpplint for `ray/tree/master/src/ray/gcs/gcs_server`
#51184 opened
Mar 8, 2025 -
[RFC] GPU object store support in Ray Core
#51173 opened
Mar 7, 2025 -
[telemetry] RLlib usage should not report Ray Train usage
#51168 opened
Mar 7, 2025 -
[telemetry] RLlib telemetry prior to `ray.init` is not reported
#51167 opened
Mar 7, 2025 -
[telemetry] RLlib should not report usage on import
#51166 opened
Mar 7, 2025 -
[telemetry] Importing Ray Tune in an actor reports Ray Train usage
#51165 opened
Mar 7, 2025 -
[telemetry] Ray Train import prior to `ray.init` isn't reported
#51164 opened
Mar 7, 2025 -
[telemetry] Ray Train should not be marked only from import
#51163 opened
Mar 7, 2025 -
[Ray debugger] Unable to use debugger on slurm cluster
#51157 opened
Mar 7, 2025 -
Custom policy
#51155 opened
Mar 7, 2025 -
[core] Place unit test alongside with the implementation
#51152 opened
Mar 7, 2025 -
[core] Named placement groups don't work
#51149 opened
Mar 7, 2025 -
[Serve] `ray.serve.exceptions.BackpressurError` raised from child cause `500` with FastAPI Ingress
#51145 opened
Mar 7, 2025 -
[RLLIB] Support for Gymnasium Graph Spaces
#51129 opened
Mar 6, 2025 -
[Core] RayCheck failed: placement_group_resource_manager_->ReturnBundle(bundle_spec) Status not OK
#51124 opened
Mar 6, 2025 -
[core] Unstable placement group GPU bundle order
#51117 opened
Mar 6, 2025 -
[Doc][Dashboard] Add Documentation about TPU Logs
#51102 opened
Mar 5, 2025 -
[Data] Extend Ray Data with read/write hive
#51094 opened
Mar 5, 2025 -
[ci] bazel lint all BUILD files - `python`
#51091 opened
Mar 5, 2025 -
[core] Integrate clang-tidy
#51088 opened
Mar 5, 2025 -
[core] Integrate gcov into our build system
#51087 opened
Mar 5, 2025 -
[core] Guard ray C++ code quality via unit test
#51086 opened
Mar 5, 2025 -
[core] Cover cpplint for `/src/ray/object_manager` (excluding `plasma`)
#51084 opened
Mar 5, 2025 -
[core] Split raylet cython file into multiple files
#51080 opened
Mar 5, 2025 -
[core] add tests to ensure the ConcurrencyGroupManager creates the correct number of threads
#51075 opened
Mar 4, 2025 -
[core] Only one of the threads in a thread pool will be initialized as a long-running Python thread
#51071 opened
Mar 4, 2025
85 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
[Compiled Graph] Enhance Compile Graph with Multi-Device Support and HCCL Integration
#51032 commented on
Mar 11, 2025 • 26 new comments -
[core] Utils to cleanup cgroup folder
#49941 commented on
Mar 7, 2025 • 25 new comments -
[RLlib] Balance package loads for `AggregatorActors`.
#51017 commented on
Mar 11, 2025 • 10 new comments -
[Autoscaler][V2] Check IM instance_status before terminating nodes
#50707 commented on
Mar 11, 2025 • 6 new comments -
[core][compiled graphs] Support reduce scatter and all gather collective for GPU communicator in compiled graph
#50624 commented on
Mar 11, 2025 • 6 new comments -
[train] Fold `v2.XGBoostTrainer` API into the public trainer class as an alternate constructor
#50045 commented on
Mar 11, 2025 • 3 new comments -
Integrate Ray Dataset with Daft Dataframe
#50630 commented on
Mar 10, 2025 • 3 new comments -
Fix ax_client.create_experiment call
#45902 commented on
Mar 7, 2025 • 2 new comments -
[wandb] Use wandb Run as a context manager
#49307 commented on
Mar 8, 2025 • 1 new comment -
[Autoscaler][Placement Group] Skip placed bundle when requesting resource
#48924 commented on
Mar 6, 2025 • 1 new comment -
[Docs][Core] Update system logs doc for dashboard subprocess module
#50984 commented on
Mar 6, 2025 • 1 new comment -
[core] Periodically check for unexpected worker socket disconnects
#50812 commented on
Mar 11, 2025 • 1 new comment -
[core] Populate obj store memory even if 0 + less copies in resource managers
#50637 commented on
Mar 4, 2025 • 1 new comment -
[Docs] Update Volcano Integration with The New Flag
#47911 commented on
Mar 4, 2025 • 0 new comments -
[doc] minor/patch version update
#48626 commented on
Mar 6, 2025 • 0 new comments -
(WIP) [core][compiled graphs] Unify code paths for NCCL P2P and collectives scheduling
#48649 commented on
Mar 5, 2025 • 0 new comments -
[core] cpp lint of object_manager
#48878 commented on
Mar 5, 2025 • 0 new comments -
[core] Lint cpp files in common
#49002 commented on
Mar 5, 2025 • 0 new comments -
[ADAG]Enable NPU (hccl) communication for CG
#47658 commented on
Mar 7, 2025 • 0 new comments -
[Doc][KubeRay] Add KubeRay image resize example to Ray doc page
#46447 commented on
Mar 7, 2025 • 0 new comments -
Ray IPv6 support
#44252 commented on
Mar 5, 2025 • 0 new comments -
[Data] Resolve block references asynchronously in separate thread
#41465 commented on
Mar 7, 2025 • 0 new comments -
[data] Make exceptions consistent when falling back to pandas
#39969 commented on
Mar 7, 2025 • 0 new comments -
[WIP] Add example how to computing embeddings with Ray Data
#39879 commented on
Mar 7, 2025 • 0 new comments -
[data] Fix map_batches on datasets with nested lists
#39869 commented on
Mar 7, 2025 • 0 new comments -
[Data] Consolidate default fault tolerance options
#39797 commented on
Mar 7, 2025 • 0 new comments -
Upgrade default AWS DLAMI
#39721 commented on
Mar 5, 2025 • 0 new comments -
[core] Gcs asio minor improvements
#49169 commented on
Mar 10, 2025 • 0 new comments -
[core] Don't get dashboard address after each dashboard connection failure
#49584 commented on
Mar 10, 2025 • 0 new comments -
adding distributional critic example
#49949 commented on
Mar 10, 2025 • 0 new comments -
[core] Introduce ConcurrentFlatMap and use for InMemoryStoreClient
#50375 commented on
Mar 10, 2025 • 0 new comments -
[data] add ClickHouse sink
#50377 commented on
Mar 4, 2025 • 0 new comments -
[Autoscaler][V2] Use running node instances to rate-limit upscaling
#50414 commented on
Mar 10, 2025 • 0 new comments -
[tune] Remove loguniform's base
#50415 commented on
Mar 8, 2025 • 0 new comments -
[RLlib] Enable spliting and zero padding of Dict observation
#50589 commented on
Mar 5, 2025 • 0 new comments -
[core] Mask data directory for core tests
#50617 commented on
Mar 8, 2025 • 0 new comments -
[chore] Delete unused build.sh
#50649 commented on
Mar 6, 2025 • 0 new comments -
[POC] Run arbitrary bash commands in doc as test
#50988 commented on
Mar 4, 2025 • 0 new comments -
Fix editorconfig option name
#50993 commented on
Mar 8, 2025 • 0 new comments -
Suppress type error
#50994 commented on
Mar 8, 2025 • 0 new comments -
Improvements to General Debugging guide
#51004 commented on
Mar 6, 2025 • 0 new comments -
[WIP][core][compiled graphs] Supporting allreduce on tuple of tensors
#51047 commented on
Mar 5, 2025 • 0 new comments -
[core] Always create a default executor
#51058 commented on
Mar 10, 2025 • 0 new comments -
[Feature] [Performance] [Docs] Disabling object spilling is not documented
#21998 commented on
Mar 9, 2025 • 0 new comments -
[Data] Heavy spilling with heterogenous compute
#44710 commented on
Mar 8, 2025 • 0 new comments -
[RLlib] make_multi_callbacks function cannot pass new api stack verification
#48089 commented on
Mar 7, 2025 • 0 new comments -
[Epic][CI] Migrate linter to Ruff
#47991 commented on
Mar 7, 2025 • 0 new comments -
No backend type associated with device type npu
#50516 commented on
Mar 7, 2025 • 0 new comments -
[Autoscaler] Autoscaler gets into infinite cycle of removing and adding nodes, never satisfies placement group
#50783 commented on
Mar 6, 2025 • 0 new comments -
[distributed debugger] exception in regular remote worker function leading to access violation when debugger connects
#51010 commented on
Mar 6, 2025 • 0 new comments -
[Core] cannot pass namespace package at runtime via py_modules
#50161 commented on
Mar 6, 2025 • 0 new comments -
[<Ray component: Data>] async `flat_map`
#50329 commented on
Mar 6, 2025 • 0 new comments -
[tune] ClientObjectRef is not found for client
#46747 commented on
Mar 5, 2025 • 0 new comments -
[Serve] FastAPI ingress does not work with composable routers
#50373 commented on
Mar 5, 2025 • 0 new comments -
[RLlib] Attribute error when trying to compute action after training Multi Agent PPO with New API Stack
#44475 commented on
Mar 5, 2025 • 0 new comments -
[serve] [llm] Known issues in Ray Serve LLM
#50931 commented on
Mar 5, 2025 • 0 new comments -
[Core][ADAG] Local mode for Ray ADAG
#47725 commented on
Mar 5, 2025 • 0 new comments -
[Serve] DeepSeek-R1 mode load stuck in H20
#50975 commented on
Mar 5, 2025 • 0 new comments -
bazel-lint all BUILD files
#50875 commented on
Mar 5, 2025 • 0 new comments -
[Core] [Slurm] Allow parallel startup of Ray workers on Slurm
#25819 commented on
Mar 5, 2025 • 0 new comments -
[core][compiled graphs] Support pinned memory for CPU <-> GPU transfers
#48086 commented on
Mar 5, 2025 • 0 new comments -
[Train] Unable to gain long-term access to S3 storage for training state/checkpoints when running on AWS EKS
#50823 commented on
Mar 5, 2025 • 0 new comments -
[core] Deserialize torch.Tensors to the correct device
#50134 commented on
Mar 4, 2025 • 0 new comments -
[Core] ray.init() hangs using Python 3.10.15 on Linux
#48625 commented on
Mar 4, 2025 • 0 new comments -
[data][tests] Add HuggingFace dataset to image comparison benchmark
#39665 commented on
Mar 7, 2025 • 0 new comments -
[Data] Add `read_images` benchmark for 100 million images
#38862 commented on
Mar 7, 2025 • 0 new comments -
[Data] Add `read_delta` API to read Delta format files
#38813 commented on
Mar 7, 2025 • 0 new comments -
CI test linux://python/ray/tests:test_runtime_env_complicated is flaky
#49674 commented on
Mar 11, 2025 • 0 new comments -
New RLlib API examples
#50897 commented on
Mar 11, 2025 • 0 new comments -
RLlib | Conncetor.py
#51039 commented on
Mar 11, 2025 • 0 new comments -
CI test windows://python/ray/tests:test_storage is consistently_failing
#48922 commented on
Mar 11, 2025 • 0 new comments -
CI test linux://rllib:learning_tests_cartpole_dqn_gpu is flaky
#46683 commented on
Mar 11, 2025 • 0 new comments -
Make sure precommit hook linter and CI matches
#50694 commented on
Mar 11, 2025 • 0 new comments -
[autoscaler] Better documentation on the node provider interface
#26087 commented on
Mar 11, 2025 • 0 new comments -
[Serve] exceptions raised by request timeout are inconsistent
#50992 commented on
Mar 11, 2025 • 0 new comments -
[CORE] importing ray closes logging handlers, breaking custom logging
#48846 commented on
Mar 10, 2025 • 0 new comments -
Release test random_shuffle.chaos failed
#49395 commented on
Mar 10, 2025 • 0 new comments -
Release test sort.chaos failed
#49765 commented on
Mar 10, 2025 • 0 new comments -
Release test sort.regular failed
#50417 commented on
Mar 10, 2025 • 0 new comments -
Release test random_shuffle.regular failed
#49383 commented on
Mar 10, 2025 • 0 new comments -
[core] Run more tests with ASAN in CI to avoid memory leak
#51057 commented on
Mar 10, 2025 • 0 new comments -
[Core] Spot preemption related retries do not count towards the max retries
#50640 commented on
Mar 10, 2025 • 0 new comments -
[Serve] Extend configuration of Serve autoscaler with custom metrics
#31540 commented on
Mar 10, 2025 • 0 new comments -
[core] Cover cpplint for all C++ folders
#50583 commented on
Mar 9, 2025 • 0 new comments -
[Feedback] Feedback for ray + uv
#50961 commented on
Mar 9, 2025 • 0 new comments