Add ability to interrupt computation #378

Closed
Keno opened this issue May 26, 2014 · 6 comments

Comments

@Keno
Contributor

Keno commented May 26, 2014

In dynamic languages, it is often useful to be able to cancel computations interactively. This almost works in openblas, except for two issues:

  1. The worker threads keep running
  2. Allocated memory may not be freed.

I tried a method whereby threads would reset to a saved position through stack unwinding, but quite frankly that hack was so ugly that even I was ashamed of it (and I usually like ugly hacks). There are several options here. The simplest, of course, would be to have a global cancel flag so that the user can ask nicely to cancel the computation. This would involve checking that flag periodically and abandoning the computation once it is set. You should make sure that the flag is checked often enough to not be frustrating (I think 100ms - 1s would still be ok for large matrices).
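To make the cancel-flag option concrete, here is a minimal sketch in C. It is not OpenBLAS code: `cancel_requested`, `request_cancel`, `clear_cancel` and `cancellable_gemm` are hypothetical names, and the naive blocked multiply only stands in for a real kernel. The point is where the poll sits: once per row block, so the `block` size determines how quickly a cancel request is noticed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical global cancel flag; not part of the OpenBLAS API. */
static atomic_int cancel_requested;

void request_cancel(void) { atomic_store(&cancel_requested, 1); }
void clear_cancel(void)   { atomic_store(&cancel_requested, 0); }

/* Naive C += A * B for n x n row-major matrices.
 * The flag is polled once per block of rows.
 * Returns 0 on completion, -1 if the computation was cancelled
 * (in which case C is left partially updated). */
int cancellable_gemm(size_t n, const double *a, const double *b,
                     double *c, size_t block)
{
    assert(block > 0);
    for (size_t i0 = 0; i0 < n; i0 += block) {
        if (atomic_load(&cancel_requested))
            return -1;                      /* user asked nicely: stop here */
        size_t i1 = (i0 + block < n) ? i0 + block : n;
        for (size_t i = i0; i < i1; i++) {
            for (size_t k = 0; k < n; k++) {
                double aik = a[i * n + k];
                for (size_t j = 0; j < n; j++)
                    c[i * n + j] += aik * b[k * n + j];
            }
        }
    }
    return 0;
}
```

A caller would run `clear_cancel()` before starting a computation and call `request_cancel()` from whatever handles the user's interrupt. Note that a cancelled call leaves the output half-written, so the caller has to treat it as garbage.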

The other option, of course, is to have a method that forcefully terminates all running threads, restarts them, frees and reallocates their memory, etc. I think this is the more economical option, though you would have to look at how to free and reallocate all buffers, but that doesn't seem impossible.

Let me know what you think and if you need any help.

@vtjnash
Contributor

vtjnash commented May 28, 2014

Is there an option (c)? Provide an API that calls pthread_kill on each worker thread, where each worker thread responds by freeing working memory and doing a longjmp back to the sleep location.
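A rough sketch of what option (c) could look like with a single worker, assuming POSIX threads and signals. Everything here is hypothetical (the `worker_*` functions, the choice of `SIGUSR1`, the spin-wait idle loop); it is not how OpenBLAS is structured. The worker records its "sleep location" with `sigsetjmp`, and the signal handler jumps back to it with `siglongjmp`. In this sketch the worker keeps its buffer rather than freeing it. The big caveat: jumping out of a handler is only safe if the interrupted code holds no locks and is not inside a non-reentrant call such as `malloc`, which is why the stand-in kernel below does nothing but arithmetic.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <setjmp.h>
#include <signal.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

#define CANCEL_SIGNAL SIGUSR1   /* assumption: the host program leaves SIGUSR1 free */
enum { CMD_IDLE, CMD_RUN, CMD_EXIT };

static _Thread_local sigjmp_buf idle_point;          /* per-thread "sleep location" */
static _Thread_local volatile sig_atomic_t can_jump; /* nonzero once idle_point is valid */
static atomic_int command = CMD_IDLE;
static atomic_int worker_busy;
static atomic_int times_cancelled;

static void on_cancel(int sig)
{
    (void)sig;
    if (can_jump)
        siglongjmp(idle_point, 1);   /* abandon the kernel, return to the idle loop */
}

static void *worker_main(void *arg)
{
    (void)arg;
    /* Allocated before sigsetjmp and never reassigned, so it survives the jump. */
    volatile double *buffer = malloc(1024 * sizeof(double));
    assert(buffer != NULL);
    buffer[0] = 0.0;

    if (sigsetjmp(idle_point, 1) != 0) {   /* nonzero: we arrived here via a cancel */
        atomic_fetch_add(&times_cancelled, 1);
        atomic_store(&worker_busy, 0);
    }
    can_jump = 1;

    while (atomic_load(&command) != CMD_EXIT) {
        int expected = CMD_RUN;
        if (atomic_compare_exchange_strong(&command, &expected, CMD_IDLE)) {
            atomic_store(&worker_busy, 1);
            for (;;)
                buffer[0] += 1.0;          /* stand-in kernel; ends only via cancel */
        }
        sched_yield();                     /* idle: spin politely until work arrives */
    }
    can_jump = 0;                          /* never jump out of free() */
    free((void *)buffer);
    return NULL;
}

int worker_start(pthread_t *t)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_cancel;
    sigemptyset(&sa.sa_mask);
    if (sigaction(CANCEL_SIGNAL, &sa, NULL) != 0)
        return -1;
    atomic_store(&command, CMD_IDLE);
    return pthread_create(t, NULL, worker_main, NULL);
}

/* Hand the worker a job and wait until it has started on it. */
void worker_submit(void)
{
    atomic_store(&command, CMD_RUN);
    while (!atomic_load(&worker_busy))
        sched_yield();
}

/* Interrupt the running job and wait until the worker is idle again. */
int worker_cancel(pthread_t t)
{
    if (pthread_kill(t, CANCEL_SIGNAL) != 0)
        return -1;
    while (atomic_load(&worker_busy))
        sched_yield();
    return 0;
}

void worker_stop(pthread_t t)
{
    atomic_store(&command, CMD_EXIT);
    pthread_join(t, NULL);
}

int worker_cancel_count(void) { return atomic_load(&times_cancelled); }
```

The idle loop deliberately uses atomics and `sched_yield` instead of a condition variable, because jumping out of `pthread_cond_wait` from a handler would leave the mutex in an unknown state. A real pool would need to block the signal around any such wait.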

@xianyi
Collaborator

xianyi commented Jan 4, 2015

@Keno, @vtjnash, @stevengj,

Sorry, I didn't read this issue before.
I think OpenBLAS can support option 2 ("have a method that forcefully terminates all running threads") or option 3 ("Provide an API that calls pthread_kill on each worker thread").

@stevengj
Contributor

stevengj commented Jan 5, 2015

@xianyi, thanks for looking into this. For the pthread_kill option, how difficult is it for each worker thread to have a signal handler that deallocates working memory? Is working memory stored in easily-accessible globals in each thread (as opposed to local vars)? @vtjnash, will the pthread_kill approach work on Windows, where signal support is reputedly a bit thin?

@xianyi
Collaborator

xianyi commented Jan 8, 2015

@stevengj, OpenBLAS creates a thread pool at startup. In the pool, every worker thread first allocates its memory and then waits for work. It's possible to write a signal handler for the worker threads.
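For readers who have not looked at the threading code, the lifecycle described above can be sketched roughly as follows. This is an illustration only, not OpenBLAS's actual implementation: the pool size, buffer size, and all `pool_*` names are made up, and the "job" is a trivial fill-and-sum. What it shows is that each worker allocates its buffer once, reuses it for every job, and only frees it at shutdown.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define NUM_WORKERS    4      /* hypothetical pool size */
#define BUFFER_DOUBLES 1024   /* hypothetical per-thread scratch size */

typedef struct {
    pthread_t thread;
    double *buffer;           /* allocated once by the worker, reused for every job */
    int id;
} worker_t;

static worker_t workers[NUM_WORKERS];
static double results[NUM_WORKERS];
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t job_posted = PTHREAD_COND_INITIALIZER;
static pthread_cond_t job_done = PTHREAD_COND_INITIALIZER;
static unsigned generation;   /* bumped once per posted job */
static int pending;           /* workers still running the current job */
static int shutting_down;

static void *worker_main(void *arg)
{
    worker_t *self = arg;
    unsigned seen = 0;        /* last job generation this worker ran */

    self->buffer = malloc(BUFFER_DOUBLES * sizeof(double));  /* allocate first */
    assert(self->buffer != NULL);

    pthread_mutex_lock(&lock);
    for (;;) {
        while (!shutting_down && generation == seen)
            pthread_cond_wait(&job_posted, &lock);            /* then wait for work */
        if (shutting_down)
            break;
        seen = generation;
        pthread_mutex_unlock(&lock);

        double sum = 0.0;                                     /* stand-in workload */
        for (int i = 0; i < BUFFER_DOUBLES; i++) {
            self->buffer[i] = self->id + 1;
            sum += self->buffer[i];
        }

        pthread_mutex_lock(&lock);
        results[self->id] = sum;
        if (--pending == 0)
            pthread_cond_signal(&job_done);
    }
    pthread_mutex_unlock(&lock);
    free(self->buffer);                                       /* freed only at shutdown */
    return NULL;
}

int pool_start(void)
{
    generation = 0;
    pending = 0;
    shutting_down = 0;
    for (int i = 0; i < NUM_WORKERS; i++) {
        workers[i].id = i;
        if (pthread_create(&workers[i].thread, NULL, worker_main, &workers[i]) != 0)
            return -1;
    }
    return 0;
}

/* Post one job to every worker and block until all of them have finished. */
void pool_run_job(void)
{
    pthread_mutex_lock(&lock);
    pending = NUM_WORKERS;
    generation++;
    pthread_cond_broadcast(&job_posted);
    while (pending > 0)
        pthread_cond_wait(&job_done, &lock);
    pthread_mutex_unlock(&lock);
}

void pool_stop(void)
{
    pthread_mutex_lock(&lock);
    shutting_down = 1;
    pthread_cond_broadcast(&job_posted);
    pthread_mutex_unlock(&lock);
    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_join(workers[i].thread, NULL);
}

double pool_result(int id) { return results[id]; }
```

With this shape, cancelling a job would not need to touch `buffer` at all. The open question is only how to get a worker out of the workload section and back to the wait at the top of the loop.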

@vtjnash
Contributor

vtjnash commented Jan 8, 2015

We effectively mimic pthread_kill in Julia, so it is possible: https://github.com/JuliaLang/julia/blob/2bb647a59229f5d6100b54f1328d33dd818c3a94/src/init.c#L305-L329

@stevengj
Contributor

stevengj commented Jan 8, 2015

If the threads allocate all their buffers when they are created, then presumably it is not necessary to free these buffers when an operation is cancelled? If the buffers persist over the lifetime of the thread, then we just need to put the thread back into a wait state.
