Scheduling error in ray multi-machine cluster mode #24
Comments
You should set data_root explicitly: sp = smallpond.init(data_root="shared/path"). In your case it is not set, so the default value is used, which is a path under your home directory.
I have solved this problem. It seems that data_root needs to point to a directory on a FUSE-mounted distributed file system such as 3FS or HDFS. This ensures that the contents of data_root are visible to the other machine nodes when a session is initialized. The other Ray nodes do not create smallpond's data_root directory themselves, so they rely on the distributed file system (3FS or another) for synchronization.
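For anyone hitting the same error, here is a minimal sketch of the fix described above. It assumes only that smallpond.init accepts data_root, as shown earlier in this thread. The helper names validate_data_root and init_session, and the mount path in the usage note, are hypothetical and not part of smallpond. The check can be run on each node to confirm the shared directory is actually visible there before a session is started.

```python
import os


def validate_data_root(path: str) -> str:
    """Return the absolute path if it exists and is writable on this node.

    Hypothetical helper (not part of smallpond): run it on every node to
    confirm the shared mount is present before starting a session.
    """
    abs_path = os.path.abspath(path)
    if not os.path.isdir(abs_path):
        raise FileNotFoundError(
            f"data_root does not exist on this node: {abs_path}"
        )
    if not os.access(abs_path, os.W_OK):
        raise PermissionError(f"data_root is not writable: {abs_path}")
    return abs_path


def init_session(data_root: str):
    """Start a smallpond session on a validated, shared data_root."""
    # Imported lazily so validate_data_root can be used without smallpond.
    import smallpond

    # data_root is the only argument taken from this thread; any other
    # options (for example the Ray address) are left at their defaults.
    return smallpond.init(data_root=validate_data_root(data_root))
```

Usage would look like sp = init_session("/mnt/shared/smallpond"), where the path is a placeholder for wherever 3FS or HDFS is mounted on every machine. The check only proves the directory exists on the node where it runs, not that all nodes see the same contents.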
Hello, I have the same problem. How did you solve it?
I deployed smallpond on two machines. When I start a job on machine A and a task is scheduled to machine B, the task fails with an error saying the file path cannot be found.
I checked the file path. The data path is generated when the session is initialized on machine A, but the same path is also used when tasks run on machine B, where it does not exist. I would expect machine B to use its own data path rather than machine A's.