platform compatibility: Windows -> "cat" not available #11

Open
spotlightgit opened this issue Apr 24, 2020 · 8 comments
Comments

@spotlightgit

Hello Oliver,

My SSH connection between the two PCs is working, meaning I can ssh without entering a password, which is one prerequisite for distributed computing with your wonderful toolbox.
Unfortunately, the "cat" command is not known to the Windows command line, which is what MATLAB's "system" command calls.
Possible workarounds:
1.) "cat" is known in Windows PowerShell
-> It seems attractive to call PowerShell instead of the command line, but
!powershell cat … or
!powershell -inputformat none cat ...
both fail in my MATLAB (I don't know why).
2.) Replace "cat" with "type"
-> It seems to provide the same functionality as "cat" on Linux systems. "type" works in both the command line and PowerShell.

To be compatible with both Linux and Windows, one could check "ispc" and then execute
system(sprintf('type …
or
system(sprintf('cat …
Instead of using "ispc", an additional option for "batch_job_distrib()" would also be possible.
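
A minimal sketch of the "ispc" check described above. The helper name copy_cmd_file_command and its master_is_pc argument are hypothetical and not part of the toolbox. Note that the remote side of the pipe still runs "cat", so this only covers a Windows master talking to a Unix-like worker.

```matlab
function cmd = copy_cmd_file_command(cmd_file, host, master_is_pc)
% Build the shell command that pipes cmd_file to the worker over ssh.
% Hypothetical helper for illustration; not a function of the toolbox.
if nargin < 3
    master_is_pc = ispc();  % Default: detect the master's platform
end
if master_is_pc
    reader = 'type';  % cmd.exe counterpart of cat
else
    reader = 'cat';
end
% The quoted part runs on the worker and assumes a Unix-like shell there
cmd = sprintf('%s %s | ssh %s "cat - > ./batch_job_distrib_cmd.bat"', ...
              reader, cmd_file, host);
end
```

The returned string could then be passed to system(), e.g. [status, cmdout] = system(copy_cmd_file_command(cmd_file, workers{w,1}));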

What do you think?

@spotlightgit
Author

Maybe it works if the following changes are made:
start_workers.m:

% Copy the command file
if ispc
   [status, cmdout] = system(sprintf('scp %s %s:./batch_job_distrib_cmd.bat', cmd_file, workers{w,1}));
else
   [status, cmdout] = system(sprintf('cat %s | ssh %s "cat - > ./batch_job_distrib_cmd.bat"', cmd_file, workers{w,1}));
end

and

% Make it executable
if ~ispc
   [status, cmdout] = system(sprintf('ssh %s "chmod u+x batch_job_distrib_cmd.bat"', workers{w,1}));
   assert(status == 0, cmdout);
end

and

% Add on the ssh command
if ispc
   cmd = sprintf('ssh %s batch_job_distrib_cmd.bat', workers{w,1});
else
   cmd = sprintf('ssh %s ./batch_job_distrib_cmd.bat', workers{w,1});
end

batch_job_distrib.m:

% Remove the command file
try
   if ispc
      [status, cmdout] = system(sprintf('ssh %s "del batch_job_distrib_cmd.bat"', workers{w,1}));
   else
      [status, cmdout] = system(sprintf('ssh %s "rm -f ./batch_job_distrib_cmd.bat"', workers{w,1}));
   end
   assert(status == 0, cmdout);
catch me

@ojwoodford
Owner

Many thanks for the input here. One issue I see is that ispc() tells you whether the master is a PC, but not the worker. The master could be a PC while the worker is running Linux.

@spotlightgit
Author

Well, I understand your concern. In my case, master and worker are both Windows systems. I think this could be solved by adding a new input argument with which the user selects the worker's operating system. Alternatively, the "workers" option could be extended -> "hostname, number of workers, system". Then it would be possible to use workers with different operating systems. If nothing is specified, the master's operating system could be used as the default for the workers.
Automatic detection would be the best solution :-) But that may be too much effort for this kind of issue ...
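
A rough sketch of the extended "workers" option suggested above, assuming a third column that holds the worker's operating system. The helper name fill_worker_os, the third column, and the 'pc'/'unix' labels are all assumptions made for illustration; the toolbox does not define them.

```matlab
function workers = fill_worker_os(workers, default_os)
% Ensure every worker row has an operating system entry.
% workers: N-by-2 or N-by-3 cell array {hostname, num_workers, os}.
% Hypothetical helper; the os column is an assumed extension.
if nargin < 2
    % Default: fall back to the master's operating system
    if ispc()
        default_os = 'pc';
    else
        default_os = 'unix';
    end
end
if size(workers, 2) < 3
    workers(:,3) = {default_os};  % No os column supplied at all
end
for w = 1:size(workers, 1)
    if isempty(workers{w,3})
        workers{w,3} = default_os;  % Fill in rows left unspecified
    end
end
end
```

The per-worker branches could then test strcmp(workers{w,3}, 'pc') instead of ispc, so that a Windows master can drive a Linux worker and vice versa.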

@spotlightgit
Author

Hey Oliver, have you already decided how you want to proceed with this issue? :-)

@ojwoodford
Owner

No. It's a bit tricky. Ideally I'd like to get rid of the file, and just send the command over ssh. But not yet sure how to do this in a platform agnostic way. I'm open to input.

@spotlightgit
Author

Well, as you mentioned, it should be possible to send the commands directly instead of starting a batch file. I work with MATLAB on Windows, so I have no experience with how running MATLAB differs between the two operating systems.

@spotlightgit
Author

Hey Oliver, based on my fast-running test function I thought distributed computing was working on Windows (with my adaptations above), but it is not. It seems that SSH behaves differently on Windows and Linux. Apparently, all applications started within an SSH session are closed when the SSH session closes. In your current implementation, executing the batch file is a single command that opens an SSH connection and closes it immediately afterwards. Therefore the MATLAB instance it starts is also closed immediately.
It looks like I missed this behaviour in my initial testing with a fast goal function. Probably the master MATLAB was doing the number crunching instead of the workers, so I got no errors. Now, with my slow goal function, I noticed the issue.
Have you ever used your toolbox on Windows? Do you have any suggestions for how this issue could be solved?

@ojwoodford
Owner

I have used it on Windows, but several years ago. The process needs to be disconnected from the shell. I believe the way to do this is using the start command: https://superuser.com/questions/1069972/windows-run-process-on-background-after-closing-cmd/1069983
However, I'm doing this, and you say it doesn't work. It needs further investigation. Unfortunately I don't have time to do this at present.
