Skip to content
This repository has been archived by the owner on Sep 12, 2018. It is now read-only.

Optimize NamespacedKeyword (and NamespacedSymbol) #648

Closed
wants to merge 3 commits into from
Closed

Optimize NamespacedKeyword (and NamespacedSymbol) #648

wants to merge 3 commits into from

Conversation

thomcc
Copy link
Contributor

@thomcc thomcc commented Apr 20, 2018

Currently NamespacedKeyword (and NamespacedSymbol) is effectively a tuple of two Strings. This means that each instance of it will require two allocations, reading the data out of them will typically require two cache misses, etc. There are a number of ways this could be improved.

One way is interning, and the comments indicate in several places that we'd like should to doing this, however, the way Mentat is currently structured, I think this is not trivial1.

Another is just to store it in a single string, and remember the boundaries. That's what these patches do.

Aside from some additional code complexity (which is localized to the module implementing this optimization), there's no real downside here, and the upside is you'll have better cache locality for code that touches both the name and namespace, fewer allocations for most code, reduced memory usage for objects holding NamespacedKeywords (e.g. std::mem::size_of::<NamespacedKeyword>() is lower), cheaper creation of these keywords, and etc.

In theory at least, to be sure about that benchmarks are required, so the first patch fixes some issues in the tx-parser benchmark, and makes it parse the transactions with the seattle data from our fixtures. While using the tx-parser to benchmark this is not great representation of mentat as a whole2, it is probably reasonably accurate for a benchmark of code that uses keywords heavily, and (most importantly) it was already there.

Anyway, there's a little bit of unsafe inside NamespacedName now, but the code is still totally safe. There are two variants we need to maintain for safety, and both are maintained by NamespacedName::new, and reflected in comments for the boundary property.

One note: Do we intend to support the serde_support feature (in the edn crate)? It's not used on most types, and was certainly the hardest part of this patch to get working properly (note: using derive(Deserialize) on a type that requires that its fields maintain certain invariants for the sake of memory safety is a very bad idea). Regardless, this does work now, is safe, and has tests...

Benchmarks

Anyway, the end result of this for me is quite a decent speedup on the tx-parser benchmark:

Before (no optimization -- it does have the first patch which fixes and extends the benchmarks though):

test bench_parse1 ... bench:       2,856 ns/iter (+/- 580)
test bench_parse2 ... bench:      51,548 ns/iter (+/- 8,617)
test bench_parse3 ... bench:     109,473 ns/iter (+/- 50,330)
test bench_parse4 ... bench:  11,773,658 ns/iter (+/- 1,575,618)

After (with optimization):

test bench_parse1 ... bench:       2,392 ns/iter (+/- 901)
test bench_parse2 ... bench:      44,803 ns/iter (+/- 16,559)
test bench_parse3 ... bench:      97,035 ns/iter (+/- 52,726)
test bench_parse4 ... bench:   9,439,930 ns/iter (+/- 1,370,286)

This varied across runs but these were relatively representative for me.

It would be nice to know how much that helps in more general cases, but it's actually better than I had expected. Of course, real workloads are going to be dominated by sqlite time, so this somewhat silly to work on (I actually waffled about submitting this PR for this reason, but it seems better to have than not), but it's bugged me for a while, and the kinds of things I work on during the day don't give me many opportunities to micro-optimize any more.

Anyway, you can try yourself with cargo +nightly bench -p mentat_tx_parser if you have a nightly build installed in rustup and feel like it. (Sadly, cargo bench is going away soon, but supposedly something similar will replace it, and not require nightly).


1 In particular, mentat treats keywords as value types and creates and extracts them all over the place. In the places we do intern things, we use reference counting, mostly to make the lifetimes bearable (I can think of ways to make this work with references, but none of them are as simple as Rc). Sadly, this is hardly ideal, since Rc<String> has a very real cost when compared with &str.

Rustc uses both references and numeric handles to interned strings depending on the case, but it's worth noting that the lifetime there is a lie, and could result in UAF if used wrong (given that Mentat connections might be long lived, the "leak everything until task cleanup" strategy seems dodgy to me, unless we spawned a task for each query/transaction, which... dunno, might work?). Anyway, nothing prevents us from doing both, and the current representation seemed needlessly expensive to me.

2 In general, it would be really good to get more benchmarks (both ones that use sqlite and ones that don't). There are certainly comments that say things about performance that are dubious, and not having numbers to back this up doesn't help... Also, things like using Rc to avoid copy is not always a win, especially when the type it holds onto is small -- most keywords and symbols are quite short, and they are Rced in several places -- this might be the right choice, but it also could be bad for a lot of reasons. Measuring would remove doubt here.

Copy link
Collaborator

@rnewman rnewman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please split this PR into just the NamespacedKeyword part and the serde part. I'm not sure if there's any value in landing the latter just yet.

ser::{Serialize, Serializer}
};

// Data storage for both NamespacedKeyword and NamespacedSymbol. This gets it's own module because
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its (on this line and the one below)

@@ -0,0 +1,220 @@
// Copyright 2016 Mozilla
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2018

#[inline]
pub fn namespace(&self) -> &str {
unsafe {
self.ns_and_name.slice_unchecked(0, self.boundary)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's necessary to use slice_unchecked here, at least without some profiling to suggest that it's worthwhile. This is the first non-FFI unsafe code in Mentat, and I don't think that's worth it.


#[inline]
pub fn components<'a>(&'a self) -> (&'a str, &'a str) {
(self.namespace(), self.name())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.ns_and_name.split_at(self.boundary)

}
}

// We can't derive these, since the derived version doesn't know how to interepret `ns_and_name`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interpret.

But a more useful comment would be that we order by namespace then by name.

@@ -214,15 +221,12 @@ impl NamespacedKeyword {
/// ```
pub fn to_reversed(&self) -> NamespacedKeyword {
let name = if self.is_backward() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function deserves rewriting now:

let name = self.name();
if name.starts_with('_') {
    name[1..].to_string()
} else {
    format!("_{}", name)
}

name: name,
namespace: self.namespace.clone(),
}
NamespacedKeyword::new(self.namespace(), &name)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reversing now allocates for the name twice, which is a little redundant.

@@ -0,0 +1,45 @@

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

License block.

@rnewman rnewman mentioned this pull request May 8, 2018
@rnewman
Copy link
Collaborator

rnewman commented May 10, 2018

I’m wrapping this up and taking it further to support aliasing in pull.

@rnewman rnewman self-assigned this May 10, 2018
@ncalexan
Copy link
Member

I’m wrapping this up and taking it further to support aliasing in pull.

I'm glad to see this landing. Could this help when consuming StructuredMap? Right now the keys are ValueRc<NamespacedKeyword>, which is frustrating to create.

@rnewman
Copy link
Collaborator

rnewman commented May 12, 2018

Obsoleted by #682.

@rnewman rnewman closed this May 12, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants