-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-16438] Add Asynchronous Actions documentation #14104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@AmplabJenkins I dont know i just create the issue on JIRA and make changes and make pull request for merge , I think it is valid pull request because in real world all application need non blocking execution. So if its is valid then merge it. |
|
@phalodi it's a bot, not a person. |
|
Jenkins test this please |
|
Test build #61980 has finished for PR 14104 at commit
|
|
Test build #61979 has finished for PR 14104 at commit
|
|
Can you paste a screenshot of the doc? |
|
@rxin Below are screen shot of Asynchronous Actions inside actions and screen shot of index of programming guide. |
docs/programming-guide.md
Outdated
| </table> | ||
|
|
||
| #### Asynchronous Actions | ||
| Spark provide asynchronous actions to execute two or more actions concurrently, these actions are execute asynchronously without blocking each other. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"without blocking each other": actions never block each other anyway. They block the calling thread.
|
Mostly this just duplicates the non-async documentation. I think it's worth mentioning the existence of the async methods with a pointer to the API docs, but, I don't think this table adds value. |
|
@srowen yeah you are right its not blocking calling thread but they execute sequentially right but in async action its return future so its running on different threads soo not run sequentially soo its fully non blocking example if i have action in that we have some transformation like map n inside map i have blocking code then how its work its block or not? |
|
@srowen yeah i agree with you its same like non async actions but yeah we should list it because its more useful in real life application when single application running multiple jobs of different users. so we should list it so user will more familiar with it and also know which are the async actions. i think its helpful for users still if you think its not a valid point then just tell me i change pull request and add just reference for async actions but i want user should know because when i first time see it i m surprise that in whole documentation no where mention about async actions. |
|
I fixed my comment above, sorry. They do block the calling thread, but actions triggered in different threads on one RDD do not block each other. The first sentence isn't correct. My suggestion is to omit the table, not all of the text you added. |
|
@srowen ok no problem so what you suggest should i remove the table and just add the line below the actions table and give link to scala and java docs? And the line "Spark provide asynchronous actions to execute two or more actions concurrently, these actions are execute asynchronously without blocking each other." is not correct so should i add that to "return future to non-block calling thread?" even in above statement my intention is also talk about calling thread because when we call action it will always in calling thread so calling thread will block but when we use async actions it will not block so we are targeted here non-blocking for calling thread because actions are called in calling thread. so i make changes and push it again as u suggest just add single line at end of actions and give link for scala and java doc api. Its cool? |
|
@srowen i make changes as you suggest i remove the table and add a single below actions table and also change statement to be more clear about non blocking. |
docs/programming-guide.md
Outdated
| </tr> | ||
| </table> | ||
|
|
||
| Spark provide asynchronous actions to execute two or more actions concurrently, these actions are execute asynchronously without blocking calling thread. Refer to the RDD API doc for asynchronous actions ([Scala](api/scala/index.html#org.apache.spark.rdd.AsyncRDDActions),[Java](api/java/org/apache/spark/rdd/AsyncRDDActions.html)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't think this is accurate. Spark can execute actions concurrently without this API. This merely makes the call non-blocking for the caller. It really adds very little beyond calling the normal API and wrapping in a Future (though it's more useful for the Java API where that's harder). Hence why I thought it was unofficially deprecated.
|
@srowen so what do you think we should not add this api reference in document? |
|
@srowen i know its nothing major just call normal functions in future but still naive user first time who learn scala and spark dont know what are future and all so at least we should add reference of it. if you suggest i simple add reference link without adding any statement of blocking and non blocking. so atleast user know there is some api like this. |
|
Although I don't think it's essential to mention this API (it's in the API doc) it wouldn't hurt to point to it here. The mention should be accurate though. I think it's correct to say that the Spark RDD API also exposes asynchronous versions of some actions like foreach, which return a handle immediately to the caller which can be used to wait for its completion. |
|
@srowen sure i will make appropriate changes and push it again |
|
@srowen now its perfect i think at last from table to one line but yeah you are right it good to just mention that spark provide it and not duplicate table of actions. |
docs/programming-guide.md
Outdated
| </tr> | ||
| </table> | ||
|
|
||
| Spark RDD API also exposes asynchronous versions of some actions like foreach, collect, count etc. which return a handle immediately to the caller which can be used to wait for its completion. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a few edits to suggest so let me just propose some text directly:
The Spark RDD API also exposes asynchronous versions of some actions, like foreachAsync for foreach, which immediately return a FutureAction to the caller instead of blocking on completion of the action. This can be used to manage or wait for the asynchronous execution of the action.
|
@srowen Thanks man i know i am not giving correct language but my intention is also same i hope its final commit :) if its done can you merge it. |
docs/programming-guide.md
Outdated
| </tr> | ||
| </table> | ||
|
|
||
| The Spark RDD API also exposes asynchronous versions of some actions, like foreachAsync for foreach, which immediately return a FutureAction to the caller instead of blocking on completion of the action. This can be used to manage or wait for the asynchronous execution of the action. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last nit pick: back-tick-quote the code nouns like I did.
|
@srowen I put back-tick-quote for foreach, foreachAsync and FutureAction. |
|
@srowen Below is final screen shot as you suggested i hope its looks good please review it and merge it. |
|
@srowen Is this correct?? if it is then please merge it otherwise give comment if needed. |
|
@phalodi no need to ping. We don't merge things immediately, especially when non-essential. It gives more time for review. |
|
Jenkins retest this please |
|
Test build #62231 has finished for PR 14104 at commit
|
## What changes were proposed in this pull request? Add Asynchronous Actions documentation inside action of programming guide ## How was this patch tested? check the documentation indentation and formatting with md preview. Author: sandy <phalodi@gmail.com> Closes #14104 from phalodi/SPARK-16438. (cherry picked from commit bf107f1) Signed-off-by: Sean Owen <sowen@cloudera.com>





What changes were proposed in this pull request?
Add Asynchronous Actions documentation inside action of programming guide
How was this patch tested?
check the documentation indentation and formatting with md preview.