Explicit Framing Protocol Proposal
Key idea: allow Oil / OSH to manipulate chunks of arbitrary data via streams.
Internally, this uses an explicit framing format that can handle arbitrary bytes, similar to WebSocket data framing: https://tools.ietf.org/html/rfc6455#section-5.
Message (a.k.a. "Record", a.k.a. "Packet") is a variable-length chunk of arbitrary bytes. Can contain newlines, nulls, etc. Quoting is not required for data sent in framed packets. Escaping is done through filters.
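As a rough illustration of what explicit framing buys, here is a minimal Python sketch of a length-prefixed format. The header layout (a 4-byte big-endian length) and the names `write_message`, `read_message`, and `read_all` are assumptions made for this example only; the proposal does not fix a wire format.

```python
import io
import struct

# Assumed header for illustration: 4-byte big-endian payload length.
HEADER = struct.Struct(">I")


def write_message(stream, payload: bytes) -> None:
    """Write one framed message: a length header followed by the raw payload."""
    stream.write(HEADER.pack(len(payload)))
    stream.write(payload)


def read_message(stream):
    """Read one framed message, or return None at a clean end of stream."""
    header = stream.read(HEADER.size)
    if not header:
        return None
    if len(header) < HEADER.size:
        raise EOFError("truncated frame header")
    (length,) = HEADER.unpack(header)
    payload = stream.read(length)
    if len(payload) < length:
        raise EOFError("truncated frame payload")
    return payload


def read_all(stream):
    """Yield every message in the stream, preserving message boundaries."""
    while True:
        message = read_message(stream)
        if message is None:
            return
        yield message


if __name__ == "__main__":
    buf = io.BytesIO()
    for chunk in (b"line1\nline2", b"\x00\x00", b""):
        write_message(buf, chunk)
    buf.seek(0)
    for message in read_all(buf):
        print(repr(message))
```

Because the reader knows each payload's length up front, newlines and nulls inside a message need no quoting, and even an empty message keeps its own boundary.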
Interface is via primitives:
- `put`: analog to `echo`, but packs arguments into message.
- `get`: analog to `read`, but unpacks a single record.
- `escape`: read messages from input, output quoted / escaped text.
- `unescape`: lift escaped input into messages of raw strings.
- Safely converting between different formats, with different quoting rules.
- Allow OSH / Oil to route data coming from disparate sources, preserving original message boundaries.
- Possible strategy for / complement to Structured Data Over Pipes.
- Distributed computing.
- Message brokers / queues / event streams.
Note: by itself, not a format for structured data: just makes it easier and safer to manipulate streams of records from the shell. Allows handling quoting at pipeline endpoints, rather than needing to be managed at each stage by the programmer.
Safely output random 64-bit values.
    rand64() {
      while true; do
        # bytes may contain embedded nulls, newlines, etc.
        read -n 8 bytes < /dev/random
        put "${bytes}"
      done
    }
    rand64 | escape --python -d'\n'
In the above example, `rand64`'s stdout is framed. `escape` must be used to obtain plain text. In this case, Python escaping is used, delimiting records with newlines. `escape` would default to some sensible shell-quoting dialect. Other formats might include:
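To make the escape / unescape round trip concrete, here is a minimal Python sketch of what a Python-dialect filter could do with newline-delimited records. The function names mirror the proposed primitives but are hypothetical stand-ins, and using `repr` / `ast.literal_eval` as the quoting dialect is an assumption of this example, not something the proposal specifies.

```python
import ast


def escape(messages):
    """Render each raw message as a Python bytes literal, one record per line.

    repr() of a bytes object escapes newlines and nulls, so every record
    occupies exactly one line of plain text.
    """
    return "".join(repr(bytes(m)) + "\n" for m in messages)


def unescape(text):
    """Parse newline-delimited Python bytes literals back into raw messages."""
    messages = []
    for line in text.split("\n"):
        if not line:
            continue  # skip the empty piece after the final delimiter
        value = ast.literal_eval(line)
        if not isinstance(value, bytes):
            raise ValueError("expected a bytes literal, got: " + line)
        messages.append(value)
    return messages


if __name__ == "__main__":
    raw = [b"\x00\x01\x02", b"two\nlines", b"it's quoted"]
    text = escape(raw)
    print(text, end="")
    print(unescape(text) == raw)
```

Quoting happens only at this endpoint; everything upstream can pass the raw messages along untouched.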
- json
- c / c++
- tsv2
JSON is "plain text"; msgpack is binary, but otherwise very similar to JSON. Both are "document-oriented": neither spec defines how to pack multiple documents into a bytestream. JSON can be comfortably newline-delimited, but only if each document is packed onto one line. With msgpack, one needs an explicit framing protocol. With framing, we can use existing tools to safely handle streams of complete documents.
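The difference shows up as soon as documents are pretty-printed. The Python sketch below streams two indented JSON documents through an assumed 4-byte length-prefix framing and compares that with naive newline splitting. The `frame` / `unframe` helpers and the header layout are illustrative assumptions; msgpack is left out only because it is not in the Python standard library.

```python
import io
import json
import struct


def frame(payload: bytes) -> bytes:
    """Prefix a payload with an assumed 4-byte big-endian length header."""
    return struct.pack(">I", len(payload)) + payload


def unframe(data: bytes):
    """Yield each payload from a byte string of concatenated frames."""
    stream = io.BytesIO(data)
    while True:
        header = stream.read(4)
        if not header:
            return
        if len(header) < 4:
            raise ValueError("truncated frame header")
        (length,) = struct.unpack(">I", header)
        payload = stream.read(length)
        if len(payload) < length:
            raise ValueError("truncated frame payload")
        yield payload


docs = [{"name": "a", "tags": ["x", "y"]}, {"name": "b\nc"}]

# Pretty-printed documents span several lines each.
pretty = [json.dumps(d, indent=2).encode("utf-8") for d in docs]

# Framed: each document comes back whole, whatever bytes it contains.
framed = b"".join(frame(p) for p in pretty)
decoded = [json.loads(p) for p in unframe(framed)]

# Newline-delimited: the same bytes split into fragments that are not documents.
naive_lines = b"".join(p + b"\n" for p in pretty).split(b"\n")
```

Here `decoded` equals `docs`, while `naive_lines` holds more pieces than there are documents, because the indentation newlines are mistaken for record boundaries.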
Advantages to demonstrate:
- msgpack contains embedded nulls, but this is no problem for framed channels.
- input json documents can contain newlines! (i.e. can handle pretty-printed input!)
- Can we use strings as byte buffers, or does there need to be a byte buffer type?
  - I.e. what happens if an Oil string contains an embedded null?
- Is it better if `get` / `put` work more like `read`, and interpret their argument as the name of a var?
  - I.e. `put foo > socket` vs. `put "${foo}" > socket`.
- Would `get` / `put` collide with any existing builtins / commands?
  - If so, perhaps they could be re-cast as either flags for `read`, `echo`, `printf`, or as a "mode" on a file descriptor, accessible via `dup`. Or some combination.