This sample uses network sockets to demonstrate several PyModel techniques for modeling and testing systems that exhibit nondeterminism, concurrency, and asynchrony.
In this sample we model and test the Python standard library socket module. We don't expect any test failures! This sample includes a simulator that can replace the standard socket module with a substitute that can be configured to exhibit failures. (The simulator is more often used to demonstrate nondeterminism.)
Here we use concepts and vocabulary explained in concepts in the notes directory. Some files in this sample that are not discussed here are explained in socket_experiments.txt in the notes directory.
Socket behavior can be more complicated than expected, because data are buffered on each end, and may be transported through the network at an irregular rate. Nondeterminism arises from the unpredictable amount of data that may be sent or received in each call. Concurrency arises from the simultaneous, unsynchronized sending and receiving at the two ends of the connection. For more explanation of socket behavior, see:
http://docs.python.org/howto/sockets.html
http://pyvideo.org/video/885/through-the-ether-and-back-again-what-happens-to
This sample demonstrates how to use PyModel to model and test this behavior, including its nondeterminism and concurrency.
The Socket sample includes several modules. Here we discuss these:
-
msocket: a model program for network sockets that can exhibit nondeterminism and concurrency.
-
synchronous, deterministic: a scenario machine and a configuration file that remove the asynchronous, nondeterministic behaviors of the msocket model program.
-
stepper: a PyModel stepper (also called a test harness or adapter) that uses the msocket model program (and optionally, scenarios and configurations) to test actual sockets on localhost (via the Python standard library socket module). Also, stepper_util, code used by the stepper.
-
observables: a configuration file used with the stepper to indicate which actions in the msocket model program it should consider observable (the remaining actions are considered controllable).
-
test, test_viewer, test_stepper: test scripts that invoke various combinations of these modules.
-
socket_simulator and select_simulator: simulators that can optionally replace the Python standard library socket and select modules in the steppers. The simulators can be configured to exhibit errors, or to demonstrate a great deal of nondeterminism and concurrency, even with small messages.
-
fsmpy/ and svg/ directories that contain finite state machine modules and graphics files generated by the test scripts from the msocket model and its scenarios and configurations.
Some other modules that also appear in this sample are explained in notes/socket_experiments.txt.
The model program in the msocket module models both ends of the connection. Here we model a system where one end of the connection is always the sender and the other end is always the receiver: all send actions occur at one end and all recv actions occur at the other. Messages sent at one end can be eventually received at the other. This very simple system is all we need to experiment with concurrency and nondeterminism.
(This model program only models sending and receiving messages - it does not model establishing or closing the connections. To test a real implementation, those operations must all be handled in the stepper, see below.)
This model program uses split actions: it models each method call in the socket API with two actions. The socket send call appears in the model as by two separate functions send_call and send_return. Each function has arguments but no returned value. The returned value is represented as the argument of the return call. For example, we model the socket API call s.call('bb') that sends the message 'bb' and returns 2 (the number of characters that were sent) with this pair of actions in the model:
send_call('bb')
send_return(2)
Likewise, we model recv with by recv_call and recv_return. For example we model the socket API call r.recv(4) that returns 'bb' with this pair of actions (here the recv argument 4 is the maximum number of characters we will accept):
recv_call(4)
recv_return('bb')
Split actions make it possible to model concurrency and nondeterminism.
For sockets, concurrency arises from the simultaneous, unsynchronized sending and receiving at the two ends of the connection. PyModel represents concurrency by interleaving. Send and receive actions can interleave without synchronization at the two ends of the connection. Each end can send or receive while the other end is blocked.
Split actions enable us to model this. For example, we might see a trace with these actions:
recv_call(4)
send_call('bb')
send_return(2)
recv_return('bb')
At the beginning of this trace, nothing has been sent yet. The receiver calls recv and blocks, because there is nothing to receive - the buffers are empty. Then the sender sends a message and returns, so the receiver can read the message and return. Here the receiver is blocked from recv_call through recv_return, but while it is blocked the sender can still run.
We say this trace exhibits asynchronous behavior because recv_call is not immediately followed by recv_return - the send actions intervene (even though recv is synchronous from the point of view of the receiver, and send is synchronous from the point of view of the sender).
For sockets, nondeterminism arises from the unpredictable amount of data that may be sent or received in each call. The send operation may accept any (incomplete) segment from the beginning of its message input, and the recv operation may return any (incomplete) segment from the beginning of the accumulated message segments that have been accepted at the sender end. Multiple send and recv calls might be needed to transmit an entire message.
For example, we might see each of these traces. Here the entire message 'bb' is sent and received:
send_call('bb')
send_return(2)
recv_call(4)
recv_return('bb')
Here only the first character of the message is sent, and that character is received:
send_call('bb')
send_return(1)
recv_call(4)
recv_return('b')
Here the entire message is sent, but only the first character is received:
send_call('bb')
send_return(2)
recv_call(4)
recv_return('b')
In PyModel model programs, nondeterminism is implemented by enabling conditions. Each action (for example send_call) has a corresponding enabling condition (for example send_call_enabled), a boolean function of the action arguments and the model program state. When generating behavior, PyModel can call the action whenever the combination of arguments satisfies the enabling condition in the current state. When checking behavior, PyModel allows actions that satisfy their enabling conditions, but considers an action that violates its enabling condition to be a test failure.
The actual action arguments that are used are drawn from domains defined in the model program (or in a configuration module). To choose the next action (when generating behavior), PyModel considers all combinations from the domains (for all actions) and chooses a combination that satisfies the enabling condition. Often, several different actions (and argument combinations) are enabled in the same state, so the resulting behavior is nondeterministic.
Split actions enable us to model nondetermistic return values. The return actions in the model also have enabling conditions, and these can be coded to exhibit nondeterminism. In the msocket model program, the enabling condition for send_return is coded to allow PyModel to nondeterministically choose any value for n (the number of characters acceped by send) that is not larger than the length of msg (the argument of send). The enabling condition for recv_return allows any value for msg (the message returned by recv) that is an initial segment of the messages sent so far. The enabling conditions for the return actions are, in effect, the postconditions for those operations --- which can be nondeterministic, if the conditions are coded to allow that.
The msocket model program in this sample exhibits nondeterminism and asynchronous behavior. However, these are rarely seen in the common situation where a socket transmits small messages over a fast network -- usually the entire message is transmitted in a single pair of send, recv calls. Therefore, sometimes we wish to restrict the behavior of the model to deterministic, synchronous behaviors.
In PyModel, the behavior of a model can be restricted by composing the model with scenario machine, or by using a configuration module to limit domains. Here we use a scenario machine to remove concurrency, and a configuration module to remove nondeterminism.
The synchronous module is a scenario machine, an FSM. A graph of this scenario appears in Socket/svg/synchronous_graph.svg (which you can display in a browser). As the graph shows, in this scenario send_call is always immediately followed by send_return, and recv_call is always immediately followed by recv_return. Moreover, a send pair is always followed by a recv pair.
Composing a model program with a scenario has the effect of synchronizing them, so the model program can only execute sequences of actions permitted by the scenario. Here, composing the msocket model program with the synchronous scenario machine has the effect of merging the two parts of each split action into a single action.
Once send or recv is called, it always returns before another action can begin - now the behavior is synchronous. Moreover, in this scenario one send is always followed by one recv - the behavior is sequential.
In a model program, the enabling condition for an action might permit a large range of argument values, but the domain for that action can limit it to a few, or just one. Different domains can be written that select different collections of arguments. The deterministic module is a PyModel configuration file that redefines the domains of send_return and recv_return.
Now send_return always accepts the entire argument of send_call (not just an initial segment or prefix), and recv_return always produces the entire buffer contents (not just a prefix). This has the effect of deterministically transmitting the entire message from sender to receiver in a single pair of send/recv calls.
The test module generates some samples of behavior that demonstrate the msocket model program and the effects of the synchronous scenario and deterministic configuration.
The test module is a test script that contains test cases that run pmt, the PyModel tester. The pmt command generates traces (test runs), which are simply sequences of actions (function calls). The pmt command can optionally execute and check the test runs it generates, but here we just generate the traces to get samples of behavior to examine. For this, we do not need a stepper (test harness) because we do not need to connect the model to any socket implementation. Here we are not calling the actual socket library, we are only executing the model, as a kind of simulation.
To run the script and see the traces, type this command at the system shell prompt:
trun test
The first test case executes this command:
pmt -n 10 -c 6 msocket
(you can also type this command yourself). It generates a short (10 to 16 step) trace from the msocket module alone, so the trace may exhibit nondeterminism and asynchrony, with interleaving send and recv and incompletely sent messages. Here are the first few steps from one of these runs:
send_call('a',)
send_return(1,)
send_call('bb',)
recv_call(4,)
recv_return('a',)
send_return(1,)
The messages (arguments to send_call) are limited to 'a' and 'bb' by the domain in the msocket module (the comma after the argument in each call is just the way pmt prints it). We chose the very short message strings 'a' and 'bb' to be the only elements in the domain in order to limit the number of nondeterministic choices for message prefixes. This makes the graphs generated by the analysis (below) easier to understand.
This command chooses actions and their arguments at random so repeating the command generates different traces (to make runs repeatable, set the random seed with the -s option). To see longer traces, use a larger -n option. Omit the cleanup option -c to generate exactly -n steps, which may stop in a non-accepting state (with a call in progress or a message in flight).
The second test case executes:
pmt -n 10 -c 6 synchronous msocket
Including both the model program and a scenario machine as command arguments forms their composition. Now the run exhibits synchronous behavior: each call is immediately followed by its return. There is no more blocking, no more interleaving of send and receive. Moreover, send and receive alternate, as dictated by the scenario machine. But the behavior may still be nondeterministic - incomplete messages may be sent. Here are the first few steps from one of these runs:
send_call('bb',)
send_return(1,)
recv_call(4,)
recv_return('b',)
send_call('a',)
send_return(1,)
The third test case executes:
pmt -n 10 -c 6 deterministic synchronous msocket
This also includes the deterministic module, the configuration file that reassigns domains to ensure that the entire message is always sent and received immediately. Here are the first few steps from one of these runs:
send_call('a',)
send_return(1,)
recv_call(4,)
recv_return('a',)
send_call('bb',)
send_return(2,)
The test_viewer module generates graphs that illustrate the behavior of the msocket model program and the effects of the synchronous scenario and deterministic configuration. Here again, we are not calling the actual socket library, we are only executing the model. To generate the graphs, type this command:
trun test_viewer
The test_viewer module is a test script that runs pmv, the PyModel viewer. The pmv program in turn invokes pma, the PyModel analyzer, and pmg, the PyModel graphics program, and dot, a rendering program. The pma program performs an analysis called exploration which generates a finite state machine (FSM) from the model, and pmg generates a file of graphics commands in the dot language from the FSM. Finally, the dot program renders the output of pmg in a chosen graphics format. This script generates .svg files, which you can view in a browser. (Older versions of IE did not display .svg files - has MS fixed this?).
The test_viewer module generates four .svg files, which are in the Socket/svg directory. (The generated FSMs are modules in the Socket/fsmpy directory).
The graph in msocketFSM.svg is generated by the command:
pmv msocket
The .svg file shows a graph of an FSM generated by exploring the msocket model program. Every node in the graph represents a program state (hover the cursor over a node to show the state variables and their values). Every edge in the graph represents an action (every edge is labeled with the function call including its argument). Every path through this FSM describes a trace that the model can execute --- runs generated by the first test case in the test module are examples. Multiple edges emerging from a node indicate nondeterminism, and every call edge not immediately followed by a return edge indicates asynchrony.
The graph in synchronous_graph.svg is generated by the command:
pmv -o synchronous_graph synchronous
It is the graph of the synchronous scenario machine, an FSM.
The graph in synchronousFSM.svg is generated by the command:
pmv synchronous socket
It is a graph of the msocket model program composed with the synchronous scenario machine. Paths through this FSM include the runs generated by the second test case in the test module. Notice that every call is followed by return, and every send is followed by recv. But, there is still nondeterminism in action arguments.
The graph in deterministicFSM.svg is generated by the command:
pmv deterministic synchronous msocket
It is a graph of the msocket model program composed with the synchronous scenario machine, using the domains in the deterministic module.
A familiar approach to testing drives the module under test with a predefined sequence of function calls, then checks whether the module returns a particular set of expected results. This is called offline testing because the test sequence and the expected results are scripted in advance. (Most unit testing tools only support offline testing.)
Clearly, offline testing cannot work for nondeterministic systems. Instead, we use on-the-fly testing, where the function calls are generated and checked as the test executes. On-the-fly testing requires that we identify the controllable actions and observable actions.
Controllable actions can be invoked in the implementation by the PyModel test runner pmt, via a stepper (or test harness). In this socket sample, send_call and recv_call are the controllable actions.
Observable actions cannot be invoked by the test runner; they are generated by the socket implementation, captured by the stepper, and passed back to pmt. Here send_return and recv_return are the observable actions. The observable actions must be identified in the model program module or a configuration module; here they are indentified in the observables configuration module.
A PyModel stepper is a test harness (sometimes called an adapter) that connects a model (here, the msocket module) to an implementation (here, the Python standard library socket module). A stepper enables the PyModel tester pmt to use a model to call methods in the implementation and check the results.
The stepper module is our preferred stepper for the Socket sample. It drives both ends of the socket connection, so it can test both send and recv. And, it does this in a single ordinary process (Python program invocation) on the local machine. It accomplishes this by connecting to both ends of a socket on localhost and interleaving calls to send and recv.
Recall that the model program only models the socket API send and recv calls. To use a real socket, it is necessary to make additional socket API calls to establish and close a connection. These calls appear in the stepper_util module, which is imported by stepper.
The stepper module supports nondeterminism and asynchrony. It distinguishes between controllable and observable actions. Here is how the stepper executes and checks test runs. First we describe the simpler situation where the socket connection does not block (that is, where its buffers are ready). For this example we consider the controllable action recv_call (in the situation where there is a message already waiting in the recv buffer):
-
The test runner pmt calls the controllable action recv_call in the msocket model (which may update the model state)
-
pmt passes the controllable action recv_call to stepper
-
stepper calls the corresponding method s.recv in the implementation (the Python standard library socket module).
-
stepper collects the result (data) from the implementation and constructs the corresponding observable action recv_return where the result appears as an argument.
-
stepper appends the observable action recv_return, including its argument (the result) to the observation queue, where pmt can retrieve it.
-
pmt finds that recv_return observable action in the observation queue and removes it.
-
pmt checks the recv_return observable action (including the msg argument) by calling its enabling condition recv_return_enabled.
-
If the enabling condition returns False, the test fails.
-
If the enabling condition returns True, pmt executes the observable action recv_return in the model (possibly updating the model state) and the test continues.
-
If the test run reaches the end and none of the actions have failed, the test passes.
Asynchrony arises when the socket connection buffers are not ready, so attempting to read or write at the socket would block. (In this example, when there is no message waiting in the recv buffer.) This stepper handles this situation with the Python standard library select function.
The stepper always calls select before attempting to call send or recv on the socket. The return values from select indicate when the call would block; in those cases, the stepper does not call send or recv, so no blocking occurs. Instead, this stepper simply returns None to the caller pmt, to indicate that the test did not fail, but a blocking call is pending. This enables pmt to proceed, and continue executing the model and making calls to the stepper. Usually, one of these calls will eventually result in unblocking the socket. Then the select function can indicate that buffers are ready, and the pending send or recv can be called as described above. (Here pmt acts as an event loop, with the select in the stepper as the source of events.)
The controllable action appears in the trace as soon as pmt chooses it, but stepper keeps track of pending calls, and does not actually make the call in the implementation until select indicates the call will not block. (This sample also contains an alternative stepper, stepper_a, which uses Python threads instead of select to handle blocking, see socket_experiments.txt.)
Notice that this stepper does not determine whether each step in the test passes or fails, and it does not construct the message describing the failure. Instead, this stepper merely collects the response, constructs an observable action, and passes it back to pmt. It is the test runner pmt that determines whether the test passes or fails, and writes the message describing the failure.
This kind of stepper (which uses observable actions) cannot be driven from a offline test script which specifies every action in the test run, but can only be used for on-the-fly testing (where the implementation generates the observable actions).
This kind of stepper is recommended for systems that exhibit nondeterminism and concurrency. This style does require the model to be written with split actions, which might be a disadvantage for simple systems. This sample also includes steppers written in different styles, but they are not as versatile. (They are discussed in socket_experiments.txt.)
At last we reach testing. Here, finally, the stepper calls the actual Python standard library socket module. The test_stepper module is a test script. To execute it:
trun test_stepper
This test script contains a single command that invokes a run with the msocket model program driving the stepper module, using the observable actions identified in the observable module:
pmt -n 10 -c 6 msocket observables -i stepper
Here is an entire test run:
Server accepts connection from ('127.0.0.1', 58748)
send_call('bb',)
send_return(2,)
recv_call(4,)
recv_return('bb',)
recv_call(4,)
send_call('a',)
send_return(1,)
send_call('a',)
recv_return('a',)
send_return(1,)
recv_call(4,)
recv_return('a',)
0. Success at step 12, reached accepting state
We do not use the synchronous scenario here, so the pmt does not alternate calling send and recv; the trace contains some consecutive recv and send calls. Sometimes pmt invokes a blocking call: when recv_call is called the second time here, the socket is empty. But, thanks to the select call in the stepper, the test session can proceed and make a send_call on the other end of the socket.
However, we see here that nondeterminism is absent from the arguments of the return actions: send always sends the entire message and recv always receives it. The model does not dictate this, but here the implementation -- the standard library socket module -- behaves deterministically. This is typical when sending small messages over localhost.
The socket_simulator and select_simulator modules are replacements for the Python standard library socket and select modules to use with the PyModel socket sample, to demonstrate nondeterminism and blocking even with small messages. The simulated socket here may nondeterministically send or receive just part of the message, no matter how short the message (even just two characters).
To use this simulator, just put socket_simulator.py in the same directory with your PyModel socket steppers and rename it (or symlink it) to socket.py. You must also rename (or symlink) select_simulator.py to select.py. Then, you may execute any of the tests.
Here, with the simulator activated, we execute
trun test_stepper
Here the stepper is no longer calling the standard library socket module; it is calling our simulator instead. We see this test run:
Server accepts connection from nowhere.simulator.org
send_call('bb',)
send_return(1,)
send_call('bb',)
send_return(2,)
send_call('bb',)
recv_call(4,)
recv_return('b',)
recv_call(4,)
recv_return('b',)
send_return(1,)
recv_call(4,)
recv_return('bb',)
0. Success at step 12, reached accepting state
Notice the nondetermism. For example, the msg argument for the first send_call is 'bb', but send_return indicates that only one character was sent.
The socket_simulator can be configured by editing its source code. For example, you can edit the buffer size bufsize that determines when send must block. The default bufsize is only 3 characters -- notice the effect in the test run above, where send_call blocks (is not followed by send_return) where the buffer contains 3 characters.
You can also configure socket_simulator to simulate a faulty connection that garbles messages. Assign errors to a nonzero positive integer to cause some test failures. Each character in the received message will be replaced by 'X', with a probability of 1.0/errors (so smaller values for errors cause more frequent errors). Here is an example:
Server accepts connection from nowhere.simulator.org
send_call('bb',)
send_return(1,)
send_call('a',)
send_return(1,)
send_call('bb',)
send_return(1,)
recv_call(4,)
0. Failure at step 7, recv_return('baX',), action not enabled
We find errors = 5 causes a test failure in most (but not all) runs of test_stepper, so that is the default.