Parallel execution of scalacheck test cases for streaming properties #19
Comments
The Vector of DStreams cannot be replaced by a ThreadLocal variable, because the driver thread needs to access each of the DStreams in the vector.
If we are already specifying the number of workers outside the prop, we could treat it not as a number of workers but as a number of multiplexed test cases, and work in a single thread as in actorSendingPropMultiplex from https://github.com/juanrh/sscheck/blob/streamingDataSendExperiments/src/test/scala/es/ucm/fdi/sscheck/spark/streaming/StreamingContextActorReceiverTest.scala. This way the program is simpler and only one thread has access to the vector of DStreams. If we later want to parallelize, we can create the threads (or a thread pool) ourselves and have more control over them. Some care must be taken to avoid blocking the driver thread.
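A minimal sketch of the single-threaded multiplexing idea, using only the Scala standard library. It is not the code from actorSendingPropMultiplex: `MultiplexSketch`, `multiplex`, and the use of an `ArrayBuffer` as a stand-in for each DStream are hypothetical names and simplifications, and padding shorter test cases with empty batches is an assumption made for the illustration.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical illustration: one thread feeds several "streams" in lockstep.
// Each ArrayBuffer stands in for one DStream; no Spark is involved here.
object MultiplexSketch {

  // testCases(i) is the sequence of batches generated for test case i.
  // Returns one buffer per test case, filled batch interval by batch interval.
  def multiplex[A](testCases: Vector[Seq[Seq[A]]]): Vector[ArrayBuffer[Seq[A]]] = {
    val streams = Vector.fill(testCases.length)(ArrayBuffer.empty[Seq[A]])
    val numBatches = if (testCases.isEmpty) 0 else testCases.map(_.length).max
    for (t <- 0 until numBatches; i <- testCases.indices) {
      // Assumption: a test case that has run out of batches gets an empty batch,
      // so every stream advances once per simulated batch interval.
      streams(i) += testCases(i).lift(t).getOrElse(Seq.empty)
    }
    streams
  }

  def main(args: Array[String]): Unit = {
    val cases = Vector(
      Seq(Seq(1, 2), Seq(3)), // test case 0: two batches
      Seq(Seq(10))            // test case 1: one batch
    )
    val streams = multiplex(cases)
    streams.zipWithIndex.foreach { case (s, i) =>
      println(s"stream $i: ${s.toList}")
    }
  }
}
```

Because a single thread writes to every buffer, no synchronization is needed. In a real Spark Streaming setting, the loop body would have to send each batch without blocking the driver thread.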
DStreamProp.forAllAlways executes the test cases sequentially, which is fine for local tests with a finely tuned batch interval. By adapting the ideas of StreamingContextDirectReceiverTest, the test cases could perhaps be executed in parallel. That test uses a single inputDStream that multiplexes the test cases by using pairs, which makes it impossible to use DStream transformers as test subjects. The idea here is instead to use a Vector (or another fast immutable indexed Seq) of DStreams. We need access to the number of workers as an implicit value wrapping an Int (like Parallelism), and we give each worker an index into that Vector of DStreams with something like:
This way each worker generates data for a different DStream, hopefully eliminating thread safety problems, and the action that checks the result just iterates over the Vector of DStreams, similarly to what is done in StreamingContextDirectReceiverTest.
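The worker-index scheme can be sketched as follows. This is not the snippet from the original issue and it does not use Spark: the `Parallelism` case class here is a hypothetical stand-in for the implicit Int wrapper mentioned above, `WorkerIndexSketch` and `runWorkers` are illustrative names, and a `ConcurrentLinkedQueue` stands in for each DStream.

```scala
import java.util.concurrent.ConcurrentLinkedQueue
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical wrapper for the number of workers, passed implicitly.
final case class Parallelism(numWorkers: Int)

object WorkerIndexSketch {
  // Stand-in for a DStream: a thread-safe queue of batches.
  type FakeStream[A] = ConcurrentLinkedQueue[Seq[A]]

  // Creates one stream per worker; worker i writes only to streams(i).
  def runWorkers[A](genBatch: Int => Seq[A])(implicit p: Parallelism): Vector[FakeStream[A]] = {
    val streams: Vector[FakeStream[A]] =
      Vector.fill(p.numWorkers)(new ConcurrentLinkedQueue[Seq[A]]())
    val workers = streams.indices.map { i =>
      Future { streams(i).add(genBatch(i)) } // each worker owns exactly one index
    }
    // Assumption: a fixed timeout is acceptable for this illustration.
    Await.result(Future.sequence(workers), 10.seconds)
    streams
  }

  def main(args: Array[String]): Unit = {
    implicit val parallelism: Parallelism = Parallelism(4)
    val streams = runWorkers(i => Seq.fill(3)(i))
    // The checking action iterates over the Vector from a single thread.
    streams.zipWithIndex.foreach { case (s, i) =>
      assert(s.peek() == Seq(i, i, i))
    }
  }
}
```

Since each worker touches a single index of an immutable Vector, the workers never contend for the same stream. Only the checking step reads all of them.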