Make Atom generic over the set of static strings #178

Merged
merged 19 commits into from
Nov 2, 2016
1 change: 1 addition & 0 deletions .travis.yml
@@ -13,6 +13,7 @@ script:
- cargo test --features log-events
- "if [ $TRAVIS_RUST_VERSION = nightly ]; then cargo test --features unstable; fi"
- cargo test --features heapsize
- "cd string-cache-codegen/ && cargo build && cd .."
- "cd examples/event-log/ && cargo build && cd ../.."
- "cd examples/summarize-events/ && cargo build && cd ../.."
notifications:
25 changes: 12 additions & 13 deletions Cargo.toml
@@ -1,14 +1,20 @@
[package]

name = "string_cache"
version = "0.2.30"
version = "0.3.0" # Also update README.md when making a semver-breaking change
authors = [ "The Servo Project Developers" ]
description = "A string interning library for Rust, developed as part of the Servo project."
license = "MIT / Apache-2.0"
repository = "https://github.com/servo/string-cache"
documentation = "http://doc.servo.org/string_cache/"
documentation = "https://docs.rs/string_cache/"
build = "build.rs"

# Do not `exclude` ./string-cache-codegen because we want to include
# ./string-cache-codegen/shared.rs, and `include` is a pain to use
# (It has to be exhaustive.)
# This means that packages for this crate include some unused files,
# but they’re not too big so that shouldn’t be a problem.

[lib]
name = "string_cache"

@@ -26,21 +32,14 @@ heap_size = ["heapsize"]

[dependencies]
lazy_static = "0.2"
serde = ">=0.6, <0.9"
serde = "0.8"
phf_shared = "0.7.4"
debug_unreachable = "0.1.1"
rustc-serialize = { version = "0.3", optional = true }
heapsize = { version = "0.3", optional = true }

[dev-dependencies]
rand = "0.3"

[dependencies.rustc-serialize]
version = "0.3"
optional = true

[dependencies.heapsize]
version = ">=0.1.1, <0.4"
optional = true

[build-dependencies]
phf_generator = "0.7.4"
phf_shared = "0.7.4"
string_cache_codegen = { version = "0.3", path = "./string-cache-codegen" }
73 changes: 72 additions & 1 deletion README.md
@@ -2,6 +2,77 @@

[![Build Status](https://travis-ci.org/servo/string-cache.svg?branch=master)](https://travis-ci.org/servo/string-cache)

[Documentation](http://doc.servo.org/string_cache/)
[Documentation](https://docs.rs/string_cache/)

A string interning library for Rust, developed as part of the [Servo](https://github.com/servo/servo) project.

## Simple usage

In `Cargo.toml`:

```toml
[dependencies]
string_cache = "0.3"
```

In `lib.rs`:

```rust
extern crate string_cache;
use string_cache::DefaultAtom as Atom;
```
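
For readers new to interning, the idea can be sketched with the standard library alone. This is a minimal illustration, not how `string_cache` stores atoms: the `Interner` type and its methods are hypothetical names introduced here. Equal strings receive the same small id, so later comparisons are integer comparisons.

```rust
use std::collections::HashMap;

/// Hypothetical toy interner (not part of string_cache).
#[derive(Default)]
pub struct Interner {
    ids: HashMap<String, u32>,
    strings: Vec<String>,
}

impl Interner {
    /// Return the id for `s`, allocating a new one on first sight.
    pub fn intern(&mut self, s: &str) -> u32 {
        if let Some(&id) = self.ids.get(s) {
            return id;
        }
        let id = self.strings.len() as u32;
        self.ids.insert(s.to_owned(), id);
        self.strings.push(s.to_owned());
        id
    }

    /// Look the original string back up from its id.
    pub fn resolve(&self, id: u32) -> Option<&str> {
        self.strings.get(id as usize).map(|s| s.as_str())
    }
}
```

Interning `"div"` twice yields the same id, while `"span"` yields a different one; `resolve` maps an id back to its text.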

## With static atoms

In `Cargo.toml`:

```toml
[package]
build = "build.rs"

[dependencies]
string_cache = "0.3"

[build-dependencies]
string_cache_codegen = "0.3"
```

In `build.rs`:

```rust
extern crate string_cache_codegen;

use std::env;
use std::path::Path;

fn main() {
    string_cache_codegen::AtomType::new("foo::FooAtom", "foo_atom!")
        .atoms(&["foo", "bar"])
        .write_to_file(&Path::new(&env::var("OUT_DIR").unwrap()).join("foo_atom.rs"))
        .unwrap()
}
```

In `lib.rs`:

```rust
extern crate string_cache;

mod foo {
    include!(concat!(env!("OUT_DIR"), "/foo_atom.rs"));
}
```

The generated code will define a `FooAtom` type and a `foo_atom!` macro.
The macro can be used in expressions or patterns, with the strings listed in `build.rs`.
For example:

```rust
fn compute_something(input: &foo::FooAtom) -> u32 {
    match *input {
        foo_atom!("foo") => 1,
        foo_atom!("bar") => 2,
        _ => 3,
    }
}
```
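
To make the pattern-matching behaviour above concrete without the crate, here is a simplified, standard-library-only sketch of an atom type that is generic over a static string set. The `StaticSet` trait, `FooSet` marker, and `index` field are assumptions made for illustration; the real crate packs its data differently and generates this code for you. The point is only that a `PhantomData` marker ties each atom to one set at the type level, and that a macro expanding to a struct literal works both as an expression and as a pattern.

```rust
use std::marker::PhantomData;

/// Hypothetical trait: each implementor names one fixed list of static strings.
pub trait StaticSet {
    fn strings() -> &'static [&'static str];
}

/// Simplified atom: an index into the static list, plus a marker that
/// ties the value to one particular static set at the type level.
pub struct Atom<S: StaticSet> {
    pub index: u32,
    pub phantom: PhantomData<S>,
}

impl<S: StaticSet> Atom<S> {
    /// Look up the string this atom stands for.
    pub fn as_str(&self) -> &'static str {
        S::strings()[self.index as usize]
    }
}

/// Marker type for the set {"foo", "bar"} (illustrative).
pub struct FooSet;

impl StaticSet for FooSet {
    fn strings() -> &'static [&'static str] {
        &["foo", "bar"]
    }
}

pub type FooAtom = Atom<FooSet>;

/// Hand-written stand-in for the generated macro: each arm expands to a
/// struct literal, which is valid both as an expression and as a pattern.
macro_rules! foo_atom {
    ("foo") => { FooAtom { index: 0, phantom: PhantomData } };
    ("bar") => { FooAtom { index: 1, phantom: PhantomData } };
}

pub fn compute_something(input: &FooAtom) -> u32 {
    match *input {
        foo_atom!("foo") => 1,
        foo_atom!("bar") => 2,
        _ => 3,
    }
}
```

Because the set is a type parameter, an atom built for a different set would be a different type, so the compiler rejects mixing atoms from unrelated sets.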
74 changes: 7 additions & 67 deletions build.rs
@@ -1,73 +1,13 @@
extern crate phf_shared;
extern crate phf_generator;

#[path = "src/shared.rs"] #[allow(dead_code)] mod shared;
#[path = "src/static_atom_list.rs"] mod static_atom_list;
extern crate string_cache_codegen;

use std::env;
use std::fs::File;
use std::io::{BufWriter, Write};
use std::mem;
use std::path::Path;
use std::slice;

fn main() {
let hash_state = generate();
write_static_atom_set(&hash_state);
write_atom_macro(&hash_state);
}

fn generate() -> phf_generator::HashState {
let mut set = std::collections::HashSet::new();
for atom in static_atom_list::ATOMS {
if !set.insert(atom) {
panic!("duplicate static atom `{:?}`", atom);
}
}
phf_generator::generate_hash(static_atom_list::ATOMS)
}

fn write_static_atom_set(hash_state: &phf_generator::HashState) {
let path = Path::new(&std::env::var("OUT_DIR").unwrap()).join("static_atom_set.rs");
let mut file = BufWriter::new(File::create(&path).unwrap());
macro_rules! w {
($($arg: expr),+) => { (writeln!(&mut file, $($arg),+).unwrap()) }
}
w!("pub static STATIC_ATOM_SET: StaticAtomSet = StaticAtomSet {{");
w!(" key: {},", hash_state.key);
w!(" disps: &[");
for &(d1, d2) in &hash_state.disps {
w!(" ({}, {}),", d1, d2);
}
w!(" ],");
w!(" atoms: &[");
for &idx in &hash_state.map {
w!(" {:?},", static_atom_list::ATOMS[idx]);
}
w!(" ],");
w!("}};");
}

fn write_atom_macro(hash_state: &phf_generator::HashState) {
let set = shared::StaticAtomSet {
key: hash_state.key,
disps: leak(hash_state.disps.clone()),
atoms: leak(hash_state.map.iter().map(|&idx| static_atom_list::ATOMS[idx]).collect()),
};

let path = Path::new(&env::var("OUT_DIR").unwrap()).join("atom_macro.rs");
let mut file = BufWriter::new(File::create(&path).unwrap());
writeln!(file, r"#[macro_export]").unwrap();
writeln!(file, r"macro_rules! atom {{").unwrap();
for &s in set.iter() {
let data = shared::pack_static(set.get_index_or_hash(s).unwrap() as u32);
writeln!(file, r"({:?}) => {{ $crate::Atom {{ unsafe_data: 0x{:x} }} }};", s, data).unwrap();
}
writeln!(file, r"}}").unwrap();
}

fn leak<T>(v: Vec<T>) -> &'static [T] {
let slice = unsafe { slice::from_raw_parts(v.as_ptr(), v.len()) };
mem::forget(v);
slice
string_cache_codegen::AtomType::new("atom::tests::TestAtom", "test_atom!")
.atoms(&[
"a", "b", "address", "area", "body", "font-weight", "br", "html", "head", "id",
])
.write_to_file(&Path::new(&env::var("OUT_DIR").unwrap()).join("test_atom.rs"))
.unwrap()
}
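
The builder-style call above replaces the hand-written macro generation that this file used to contain. As a rough sketch of what such a generator has to do, the following standard-library-only code emits a `macro_rules!` definition as a string and rejects duplicate atoms, as the removed `generate()` function did. `MacroGen`, its methods, and the emitted `index` field are hypothetical; the real `string_cache_codegen` builds a perfect hash and packs the atom data, which is not reproduced here.

```rust
use std::collections::HashSet;
use std::fmt::Write;

/// Hypothetical, simplified stand-in for a code-generating builder.
pub struct MacroGen {
    type_name: String,
    macro_name: String,
    atoms: Vec<String>,
}

impl MacroGen {
    /// `macro_name` may be given with a trailing `!`, which is stripped.
    pub fn new(type_name: &str, macro_name: &str) -> Self {
        MacroGen {
            type_name: type_name.to_owned(),
            macro_name: macro_name.trim_end_matches('!').to_owned(),
            atoms: Vec::new(),
        }
    }

    /// Add static strings; can be chained.
    pub fn atoms(mut self, atoms: &[&str]) -> Self {
        self.atoms.extend(atoms.iter().map(|s| s.to_string()));
        self
    }

    /// Produce the macro source, or an error message on duplicate atoms.
    pub fn generate(&self) -> Result<String, String> {
        let mut seen = HashSet::new();
        let mut out = String::new();
        writeln!(out, "macro_rules! {} {{", self.macro_name).unwrap();
        for (index, atom) in self.atoms.iter().enumerate() {
            if !seen.insert(atom.as_str()) {
                return Err(format!("duplicate static atom `{:?}`", atom));
            }
            // One macro arm per atom, keyed on the string literal.
            writeln!(
                out,
                "    ({:?}) => {{ {} {{ index: {}, phantom: ::std::marker::PhantomData }} }};",
                atom, self.type_name, index
            )
            .unwrap();
        }
        writeln!(out, "}}").unwrap();
        Ok(out)
    }
}
```

A build script could write the returned string to a file under `OUT_DIR` and pull it into the crate with `include!`, in the same way the README example does.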
2 changes: 1 addition & 1 deletion examples/event-log/src/main.rs
@@ -9,7 +9,7 @@

extern crate string_cache;

use string_cache::Atom;
use string_cache::DefaultAtom as Atom;
use string_cache::event;

use std::io;
7 changes: 4 additions & 3 deletions examples/summarize-events/src/main.rs
@@ -12,14 +12,15 @@ extern crate string_cache;
extern crate rustc_serialize;
extern crate phf_shared;

#[path = "../../../src/shared.rs"]
#[path = "../../../string-cache-codegen/shared.rs"]
#[allow(dead_code)]
mod shared;

use string_cache::Atom;
use string_cache::DefaultAtom as Atom;

use std::{env, cmp};
use std::collections::hash_map::{HashMap, Entry};
use std::marker::PhantomData;
use std::path::Path;

#[derive(RustcDecodable, Debug)]
@@ -88,7 +89,7 @@ fn main() {

// FIXME: We really shouldn't be allowed to do this. It's a memory-safety
// hazard; the field is only public for the atom!() macro.
_ => Atom { unsafe_data: ev.id }.to_string(),
_ => Atom { unsafe_data: ev.id, phantom: PhantomData }.to_string(),
};

match summary.entry(string) {