com.github.pkpkpk/fress {:mvn/version "0.4.312"}
[com.github.pkpkpk/fress "0.4.312"]
Read the introductory blog post
Try the quick-start
When used in concert with serde-fressian, fress can be used to convey rich values to and from WebAssembly modules. Serde-fressian is an implementation of fressian for the Rust programming language. When compiled for WebAssembly, the serde_fressian::wasm module is designed to interface with the fress.wasm namespace. Together they deliver a seamless transition between WebAssembly functions and their cljs consumers with minimal overhead. Fressian-wasm makes wasm functions feel like supercharged clojurescript functions.
There is a second companion library: cargo-cljs, a clojurescript library for scripting the rust build tool cargo via nodejs.
Please refer to the doc folder for wasm specific documentation.
The remainder of this readme pertains to fressian for binary data unrelated to wasm usage. There are relevant details about the fressian encoding itself, but the api for reading and writing to wasm modules is completely different.
(everything below this line is for binary serialization usage only)
(require '[fress.api :as fress])
(def buf (fress/byte-stream))
(def writer (fress/create-writer buf))
(def data [{::sym 'foo/bar
            :inst #?(:cljs (js/Date.)
                     :clj (java.util.Date.))
            :set #{42 true nil "string"}}])
(fress/write-object writer data)
(def reader (fress/create-reader buf))
(assert (= data (fress/read-object reader)))
In javascript, binary data is kept in ArrayBuffers. TypedArrays are simply interpretive views on array buffer instances. Both are fixed size and there are no streaming constructs. If you want to 'grow' a typed array, you need to allocate a new buffer and copy over the old buffer's contents. Doing so on every write is prohibitively slow, so fress.api/byte-stream addresses this by pushing bytes onto a plain javascript array which is realized into a byte-array only when deref'd or closed.
On the jvm, fress.api/byte-stream is the BytesOutputStream provided by fressian extended with clojure.lang.IDeref. Dereferencing returns a java.nio.ByteBuffer that realizes the current state of its stream.
fress.api/create-reader will automatically coerce byte-streams into a readable buffer
- jvm too
- useful for testing
- byte-streams are stateful, no wrap behavior in js
- If you deref a byte-stream and then continue to write, you will need to deref again to see the new bytes output in the buffer
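The stateful deref behavior described above can be sketched as follows. This is a minimal illustration that uses only calls already shown in this readme; the var names are arbitrary, and the concrete type returned by deref depends on the platform (a byte-array in cljs, a java.nio.ByteBuffer on the jvm).

```clojure
(require '[fress.api :as fress])

(def buf (fress/byte-stream))
(def writer (fress/create-writer buf))

(fress/write-object writer "first")
(def snapshot-1 @buf) ; realizes only the bytes written so far

(fress/write-object writer "second")
(def snapshot-2 @buf) ; deref again to see the bytes added since snapshot-1

;; create-reader coerces the byte-stream itself, so both values are readable
(def reader (fress/create-reader buf))
(def first-value (fress/read-object reader))
(def second-value (fress/read-object reader))
```

Holding on to snapshot-1 after further writes is the case to watch for: it is a realized copy, not a live view of the stream.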
fress.api/create-writer also accepts any typed-array or arraybuffer instance
- you are responsible for making sure you have enough room to write.
- in cljs, byte-streams can be recycled by calling reset
When a reader reaches the end of its buffer, it will throw a (js/Error. "EOF") (java.io.EOFException on JVM).
- nil is a value, so gotta throw!
fress.api/read-all and fress.api/read-batch will handle this for you.
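If you want to drive the reader yourself, the EOF handling can be sketched roughly like this. read-until-eof is a hypothetical helper written for illustration, not part of fress; it assumes a .cljc file so the reader conditional can pick the platform's EOF type. Note that on cljs it catches any js/Error, not only EOF, so real code may want to inspect the message.

```clojure
(require '[fress.api :as fress])

(defn read-until-eof
  "Hypothetical helper: reads objects until the reader throws EOF and
  returns them as a vector. A sentinel keyword marks the end because
  nil is itself a legal value."
  [reader]
  (loop [acc []]
    (let [v (try
              (fress/read-object reader)
              (catch #?(:clj java.io.EOFException :cljs js/Error) _
                ::eof))]
      (if (= ::eof v)
        acc
        (recur (conj acc v))))))

(def buf (fress/byte-stream))
(def writer (fress/create-writer buf))
(fress/write-object writer nil)
(fress/write-object writer 42)

;; collects the nil and the 42, then stops at EOF
(def values (read-until-eof (fress/create-reader buf)))
```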
By default, footers in cljs readers will automatically trigger an EOF throw, preventing out-of-bounds reads when there is excess room remaining. The intended use case is receiving a pointer on memory and simply reading off fressian bytes until whichever comes first: a natural EOF or a footer. You can avoid the conundrum by always writing single collections, but that is not always possible or desirable.
- fress.api/write<object, & opts> -> bytes
  - takes any writable object (including a collection) and returns fressian bytes.
  - accepts same args as create-writer but writer + buffer creation are done for you
  - :footer? true option to automatically seal bytes off with footer; convenient when you have all data you want to write ahead of time.
- fress.api/read<readable, & opts> -> any
  - takes bytes or bytestream and returns a single object read off the bytes
  - accepts same args as create-reader but reader creation is done for you
- fress.api/read-all<(readable|reader), & options> -> Vec<any>
  - accepts reader, bytes, or bytestream, reads off everything it can. returns vector of contents
  - accepts same options as reader
  - automatically handles thrown EOFs for you
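A short sketch of the three convenience functions listed above. It assumes options are passed as keyword arguments in the same style as the create-writer calls elsewhere in this readme; the var names are illustrative only.

```clojure
(require '[fress.api :as fress])

;; one-shot write: the writer and buffer are created for you,
;; and :footer? true seals the bytes off with a footer
(def payload (fress/write {:a 1 :b [2 3]} :footer? true))

;; one-shot read of a single object off those bytes
(def result (fress/read payload))

;; read-all drains everything a byte-stream contains and
;; swallows the EOF thrown at the end
(def buf (fress/byte-stream))
(def writer (fress/create-writer buf))
(doseq [x [1 "two" :three]]
  (fress/write-object writer x))
(def everything (fress/read-all buf))
```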
- Decide on a string tag name for your type, and the number of fields it contains
- define a write-handler, a fn<writer, object>
  - use (fress/write-tag writer tag field-count)
  - call fress/write-object on each component of your type
- create a writer and pass a :handlers map of type->handler or type->tag->server
Example: let's write a handler for javascript errors
(defn write-error [writer error]
  (let [name (.-name error)
        msg (.-message error)
        stack (.-stack error)]
    (fress/write-tag writer "js-error" 3) ;<-- don't forget field count!
    (fress/write-object writer name) ;<= implicit ordering, hmmm...
    (fress/write-object writer msg)
    (fress/write-object writer stack)))
(def e (js/Error. "wat"))
(def out (fress/byte-stream)) ; the buffer the writers below write into
(def writer (fress/create-writer out))
(fress/write-object writer e) ;=> throws, no handler!
(def writer (fress/create-writer out :handlers {js/Error write-error}))
(fress/write-object writer e) ;=> OK!
- Fress will automatically test if each written object is an instance of a registered type->write-handler pair. So write-error will also work for js/TypeError, js/SyntaxError etc
- types that are not prototypically related can be made to share a write handler by passing them as a seq in the handler entry key, ie (create-writer out :handlers {[typeA typeB] write-A-or-B})
So now let's try reading our custom type:
(def rdr (fress/create-reader out))
(def o (fress/read-object rdr))
(assert (instance? r/TaggedObject o))
So what happened? When the reader encounters a tag in the buffer, it looks for a registered read handler, and if it doesn't find one, it uses the field count to read off each component of the unidentified type and returns them as a TaggedObject. TaggedObjects are generic containers for types a reader does not know how to handle. The field count is important because it lets consumers gracefully preserve the reading frame without prior knowledge of whatever types you throw at it. Downstream users do not have to care.
We can fix this by adding a read-error function:
(defn read-error [reader tag field-count]
  (assert (= 3 field-count))
  {:name (fress/read-object reader) ; :name was first, right?
   :msg (fress/read-object reader)
   :stack (fress/read-object reader)
   :tag tag})
(def rdr (fress/create-reader out :handlers {"js-error" read-error}))
(fress/read-object rdr) ;=> {:name "Error" :msg "wat" :stack ...}
Our write-error function chose to write each individual component sequentially, not as children of a parent list or, even better, as a map. This puts a burden on our read fn to both grab each individual field and know the right order of the components as they are read off. This will not be pleasant to maintain. A better solution would be to just write errors as maps and let fressian do the work for us.
(defn write-error [writer error]
  (fress/write-tag writer "js-error" 1)
  (fress/write-object writer
    {:name (.-name error)
     :msg (.-message error)
     :stack (.-stack error)}))
(defn read-error [reader tag field-count]
  (assoc (fress/read-object reader) :tag tag))
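Putting the map-based handlers together, a full cljs round trip might look like the sketch below. It repeats the two handler definitions so it stands alone, and it writes a js/TypeError to exercise the instance check described earlier; the var names are illustrative.

```clojure
(require '[fress.api :as fress])

(defn write-error [writer error]
  (fress/write-tag writer "js-error" 1) ; a single field: the map
  (fress/write-object writer
    {:name (.-name error)
     :msg (.-message error)
     :stack (.-stack error)}))

(defn read-error [reader tag field-count]
  (assoc (fress/read-object reader) :tag tag))

(def out (fress/byte-stream))
(def writer (fress/create-writer out :handlers {js/Error write-error}))
;; a TypeError is an instance of js/Error, so the same handler applies
(fress/write-object writer (js/TypeError. "wat"))

(def rdr (fress/create-reader out :handlers {"js-error" read-error}))
(def err-map (fress/read-object rdr))
```

Because the reader only sees a map, adding a new key in write-error later does not break existing read handlers.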
- Fixed-size vectors, lists, and sequences are all written as generic lists and are read back as vectors.
- Lists have three components: type, length, and body. Short lists have their length 'packed' with their type code. Longer lists are given a dedicated length segment. A list with length > 8 would be represented as:
LIST | length-n | item_0 | item_1 | ... | item_n-1
       ^---can be own multi-byte reading frame
- When you are in a situation where you have a sequence of indeterminate size or you need to write asynchronously, you can use fress.api/begin-open-list and fress.api/begin-closed-list to establish a list reading frame. Rather than rely on a known length, readers will encounter an 'open' signal and then call read-object continuously until an END_COLLECTION is seen or EOF is thrown.
BEGIN_CLOSED_LIST | value | value | value | END_COLLECTION
- The difference between begin-closed-list and begin-open-list is that EOF is an acceptable ending for an open list and will be handled for you. Closed lists expect an END_COLLECTION signal and will throw EOF as normal if it is encountered prematurely.
- Many structs are written as variants of 'list' with the differences being their tag and the way read handlers interpret their contents. For example:
  - A set is simply a SET code followed by a list
  - A map is a MAP code followed by a list of k-v pairs. So the bytecode for a map with 3 entries could look something like:
MAP | BEGIN_CLOSED_LIST | k | v | k | v | k | v | END_COLLECTION
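As a rough sketch of the open-list case described above: the writer below starts an open list, writes a few items, and never writes END_COLLECTION, relying on EOF to end the list. It assumes begin-open-list takes the writer as its only argument, which this readme does not spell out, so treat the call shape as an assumption.

```clojure
(require '[fress.api :as fress])

(def buf (fress/byte-stream))
(def writer (fress/create-writer buf))

;; assumed call shape: (begin-open-list writer)
(fress/begin-open-list writer)
(doseq [x (range 3)]
  (fress/write-object writer x))
;; no END_COLLECTION is written; for an open list EOF is an acceptable ending

(def reader (fress/create-reader buf))
;; the reader sees the 'open' signal and keeps reading items until EOF
(def items (fress/read-object reader))
```

A closed list written the same way would instead throw EOF on read, since its terminator is missing.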
clojure.data.fressian can use defrecord constructors to produce symbolic tags (i.e. the class name) for serialization, and use those same symbolic tags to resolve constructors during deserialization. In cljs, symbols are munged in advanced builds, and we have no runtime resolve. How do we deal with this?
- When writing records, include a :record->name map at writer creation
  - ex: {RecordConstructor "app.core/RecordConstructor"}
  - the string name should be the same as the string of the fully resolved symbol, and is used to generate a symbolic tag representing its className
- When reading records, include a :name->map-ctor map at reader creation
  - ex: {"app.core/RecordConstructor" map->RecordConstructor}
  - Why the record map constructor? Because clojure.data.fressian's default record writer writes record contents as maps
  - if the name is not recognized, it will be read as a TaggedObject containing all the fields defined by the writer.
(require '[fress.api :as fress])
(defrecord SomeRecord [f1 f2]) ; map->SomeRecord is now implicitly defined
(def rec (SomeRecord. "field1" "field2"))
(def buf (fress/byte-stream))
(def writer (fress/create-writer buf :record->name {SomeRecord "myapp.core.SomeRecord"}))
(fress/write-object writer rec)
(def reader (fress/create-reader buf :name->map-ctor {"myapp.core.SomeRecord" map->SomeRecord}))
(assert (= rec (fress/read-object reader)))
- in clojurescript you can override the default record writer by adding a "record" entry in :handlers. A built-in use case for this is fress.api/field-caching-writer, which offers a way to automatically cache the values of keys that pass a predicate
(fress/create-writer buf :handlers {"record" (fress/field-caching-writer #{:f1})})
- on the jvm:
(let [cache-writer (fress/field-caching-writer #{:f1})]
  (fress/create-writer buf :handlers
    {clojure.lang.IRecord {"clojure/record" cache-writer}}))
Fressian compresses UTF-8 strings when writing them. This means a reader must decompress each char to reassemble the string. If payload size is your primary concern this is great, but if you want faster read+write times there is another option. The javascript TextEncoder / TextDecoder API has growing support (also see analog in node util module) and is written in native code. TextEncoder will convert a javascript string into plain utf-8 bytes, and the TextDecoder can reassemble a javascript string from raw bytes faster than javascript can assemble a string from compressed bytes.
By default fress writes strings using the default fressian compression. If you'd like to write raw UTF-8, you can use fress.api/write-utf8 on a string, or bind fress.writer/*write-raw-utf8* to true before writing. If you are targeting a jvm reader, you must also bind *write-utf8-tag* to true so the tag is picked up by the jvm reader. Otherwise a code is used that is only present in fress clients.
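The two ways of writing raw UTF-8 might be used as in the cljs sketch below. It assumes write-utf8 takes the writer followed by the string, mirroring write-object; that argument order is an assumption, not something stated above.

```clojure
(require '[fress.api :as fress]
         '[fress.writer]) ; loaded for the dynamic var bound below

(def buf (fress/byte-stream))
(def writer (fress/create-writer buf))

;; option 1: the dedicated call (assumed arg order: writer, then string)
(fress/write-utf8 writer "raw utf-8 via write-utf8")

;; option 2: bind the dynamic var around ordinary writes
(binding [fress.writer/*write-raw-utf8* true]
  (fress/write-object writer "raw utf-8 via binding"))

;; a fress reader is expected to hand both back as ordinary strings
(def reader (fress/create-reader buf))
(def a (fress/read-object reader))
(def b (fress/read-object reader))
```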
write-object has a second arity that accepts a boolean cache? parameter. The first time this is called on a value, a 'cache-code' is assigned to that object, which signals the reader to associate that code with the object. Subsequent writes of an identical object will just be written as that code, and the reader will interpret it and return the same value.
- Readers can only interpret these cache codes in the context in which they were established. A naive reader who picks up reading bytes after a cache signal is sent will simply return integers and not the appropriate value
- Writers can signal readers to reset their cache with a call to reset-caches. You are free to have multiple cache contexts within the same bytestream
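A minimal sketch of the caching arity described above: the same value is written twice with cache? set to true, so the second write should cost only a cache code. The value and var names are illustrative.

```clojure
(require '[fress.api :as fress])

(def buf (fress/byte-stream))
(def writer (fress/create-writer buf))

(def big-value {:label "repeated payload" :items (vec (range 100))})

;; first write assigns a cache code and sends the full value
(fress/write-object writer big-value true)
;; second write of an identical value is sent as the cache code only
(fress/write-object writer big-value true)

;; the reader must start from the beginning of the cache context
(def reader (fress/create-reader buf))
(def first-read (fress/read-object reader))
(def second-read (fress/read-object reader))
```

A reader that started after the first write would see the bare cache code on the second read, which is the naive-reader case warned about above.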
Writers maintain a checksum of every byte written, and include this (with a byte-count) inside a footer. By default readers ignore this, but you can pass :checksum? true when creating a reader to validate the checksum when a footer is read. An invalid checksum will throw.
Fress wraps clojure.data.fressian and can be used as a drop in replacement.
- read-handlers are automatically wrapped in fressian lookups; just pass a map of {tag fn<rdr,tag,field-count>}, same as you would for cljs
- write-handlers are also automatically wrapped as lookups, but the shape for handler args is different! It must be {type {tag fn<writer, obj>}}
(fress/create-writer out :handlers {MyType {"mytype" (fn [writer obj] ...)}})
- if you are already reifying fressian read+writeHandlers, they will be passed through as is
- CLJS has no support for BigDecimal & Ratios at this time
fressian type | cljs-read | cljs-write | note |
---|---|---|---|
int | ✅ | ✅ | fress.reader/*throw-on-unsafe?* defaults to true |
bool | ✅ | ✅ | |
bytes | ✅ | ✅ | Int8Array |
float | ✅ | ✅ | |
double | ✅ | ✅ | |
string | ✅ | ✅ | |
null | ✅ | ✅ | |
list | ✅ | ✅ | |
boolean[] | ✅ | ✅ | use (write-as wrt "boolean[]" val) |
int[] | ✅ | ✅ | Int32Array |
long[] | ✅ | ✅ | BigInt64Array or (write-as wrt "long[]" val) |
float[] | ✅ | ✅ | Float32Array |
double[] | ✅ | ✅ | Float64Array |
Object[] | ✅ | ✅ | use (write-as wrt "Object[]" val) |
map | ✅ | ✅ | |
set | ✅ | ✅ | |
uuid | ✅ | ✅ | |
regex | ✅ | ✅ | |
inst | ✅ | ✅ | |
uri | ✅ | ✅ | goog.Uri |
records | ✅ | ✅ | see usage details in README |
bigint | ✅ | ✅ | |
sym | ✅ | ✅ | |
key | ✅ | ✅ | |
char | ✅ | ✅ | use (write-as wrt "char" \a) |
ratio | ❌ | ❌ | TODO |
bigdec | ❌ | ❌ | TODO |