Supplant the Borrow bound on keys with a new AsKey trait #14

Draft
wants to merge 2 commits into
base: master
23 changes: 9 additions & 14 deletions README.md
@@ -19,22 +19,21 @@ enable compilation of `Deserialize` and `Serialize` implementations for `Trie`.
## When should I use a QP-trie?

QP-tries as implemented in this crate are key-value maps for any keys which
-implement `Borrow<[u8]>`. They are useful whenever you might need the same
-operations as a `HashMap` or `BTreeMap`, but need either a bit more speed
-(QP-tries are as fast or a bit faster as Rust's `HashMap` with the default
-hasher) and/or the ability to efficiently query for sets of elements with a
-given prefix.
+implement `qp_trie::AsKey`, a specialized trait akin to `Borrow<[u8]>`. They
+are useful whenever you might need the same operations as a `HashMap` or
+`BTreeMap`, but need either a bit more speed (QP-tries are as fast or a bit
+faster as Rust's `HashMap` with the default hasher) and/or the ability to
+efficiently query for sets of elements with a given prefix.

QP-tries support efficient lookup/insertion/removal of individual elements,
lookup/removal of sets of values with keys with a given prefix.

## Examples

-Keys can be any type which implements `Borrow<[u8]>`. Unfortunately at the
-moment, this rules out `String` - while this trie can still be used to store
-strings, it is necessary to manually convert them to byte slices and `Vec<u8>`s
-for use as keys. Here's a naive, simple example of putting 9 2-element byte arrays
-into the trie, and then removing all byte arrays which begin with "1":
+Keys can be any type which implements `AsKey`. Currently, this means strings as
+well as byte slices, vectors, and arrays. Here's a naive, simple example of
+putting 9 2-element byte arrays into the trie, and then removing all byte
+arrays which begin with "1":

```rust
use qp_trie::Trie;
@@ -135,10 +134,6 @@ test bench_trie_get ... bench: 40,898,914 ns/iter (+/- 13,400,062)
test bench_trie_insert ... bench: 50,966,392 ns/iter (+/- 18,077,240)
```

-## Future work
-
-- Add wrapper types for `String` and `str` to make working with strings easier.
-
## License

The `qp-trie-rs` crate is licensed under the MPL v2.0.
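
Note on the README hunk above: the example it introduces is collapsed in this view, so here is a rough, standard-library-only illustration of what "removing all byte arrays which begin with `1`" means. It deliberately uses a `BTreeMap` as a stand-in rather than the crate's `Trie`; `remove_prefix` below is a hypothetical helper written for this note, not the crate's method, and a QP-trie would answer the same query without scanning keys.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: remove every entry whose key starts with `prefix`
/// and report how many entries were removed.
fn remove_prefix(map: &mut BTreeMap<Vec<u8>, u32>, prefix: &[u8]) -> usize {
    // Keys are sorted, so all keys sharing `prefix` form one contiguous run
    // that begins at the first key greater than or equal to `prefix`.
    let doomed: Vec<Vec<u8>> = map
        .range(prefix.to_vec()..)
        .take_while(|(k, _)| k.starts_with(prefix))
        .map(|(k, _)| k.clone())
        .collect();
    for k in &doomed {
        map.remove(k);
    }
    doomed.len()
}

fn main() {
    // Nine 2-element keys: "11", "12", ..., "33".
    let mut map = BTreeMap::new();
    for x in b'1'..=b'3' {
        for y in b'1'..=b'3' {
            map.insert(vec![x, y], 0u32);
        }
    }
    let removed = remove_prefix(&mut map, b"1");
    println!("removed {} keys, {} remain", removed, map.len());
}
```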
18 changes: 9 additions & 9 deletions src/entry.rs
@@ -1,13 +1,13 @@
-use std::borrow::Borrow;
use std::marker::PhantomData;
use std::mem;

use unreachable::UncheckedOptionExt;

+use key::AsKey;
use node::{Leaf, Node};
use util::nybble_get_mismatch;

-pub fn make_entry<'a, K: 'a + Borrow<[u8]>, V: 'a>(
+pub fn make_entry<'a, K: 'a + AsKey, V: 'a>(
key: K,
root: &'a mut Option<Node<K, V>>,
) -> Entry<'a, K, V> {
@@ -24,12 +24,12 @@ pub enum Entry<'a, K: 'a, V: 'a> {
Occupied(OccupiedEntry<'a, K, V>),
}

-impl<'a, K: 'a + Borrow<[u8]>, V: 'a> Entry<'a, K, V> {
+impl<'a, K: 'a + AsKey, V: 'a> Entry<'a, K, V> {
fn nonempty(key: K, root: &'a mut Option<Node<K, V>>) -> Entry<'a, K, V> {
let (exemplar_ptr, mismatch) = {
let node = unsafe { root.as_mut().unchecked_unwrap() };
-let exemplar = node.get_exemplar_mut(key.borrow());
-let mismatch = nybble_get_mismatch(exemplar.key_slice(), key.borrow());
+let exemplar = node.get_exemplar_mut(key.as_nybbles());
+let mismatch = nybble_get_mismatch(exemplar.key_slice(), key.as_nybbles());
(exemplar as *mut Leaf<K, V>, mismatch)
};

@@ -112,7 +112,7 @@ enum VacantEntryInner<'a, K: 'a, V: 'a> {
Internal(usize, u8, &'a mut Node<K, V>),
}

-impl<'a, K: 'a + Borrow<[u8]>, V: 'a> VacantEntry<'a, K, V> {
+impl<'a, K: 'a + AsKey, V: 'a> VacantEntry<'a, K, V> {
/// Get a reference to the key associated with this vacant entry.
pub fn key(&self) -> &K {
&self.key
@@ -151,7 +151,7 @@ pub struct OccupiedEntry<'a, K: 'a, V: 'a> {
root: *mut Option<Node<K, V>>,
}

-impl<'a, K: 'a + Borrow<[u8]>, V: 'a> OccupiedEntry<'a, K, V> {
+impl<'a, K: 'a + AsKey, V: 'a> OccupiedEntry<'a, K, V> {
/// Get a reference to the key of the entry.
pub fn key(&self) -> &K {
let leaf = unsafe { &*self.leaf };
@@ -167,15 +167,15 @@ impl<'a, K: 'a + Borrow<[u8]>, V: 'a> OccupiedEntry<'a, K, V> {
let leaf_opt = root.take();
let leaf = unsafe { leaf_opt.unchecked_unwrap().unwrap_leaf() };

-debug_assert!(leaf.key_slice() == self.key().borrow());
+debug_assert!(leaf.key_slice() == self.key().as_nybbles());
(leaf.key, leaf.val)
}

Some(Node::Branch(..)) => {
let branch_opt = root.as_mut();
let branch = unsafe { branch_opt.unchecked_unwrap() };

-let leaf_opt = branch.remove_validated(self.key().borrow());
+let leaf_opt = branch.remove_validated(self.key().as_nybbles());

debug_assert!(leaf_opt.is_some());
let leaf = unsafe { leaf_opt.unchecked_unwrap() };
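
Note on the `src/entry.rs` changes: every call site now passes `key.as_nybbles()` where it used to pass `key.borrow()`, so the trie still only ever sees a `&[u8]`. For readers unfamiliar with the nybble view of a key, the sketch below shows one way a first-mismatch search over two byte slices could work. The names `nybble_at` and `first_nybble_mismatch` are hypothetical, the high-nybble-first ordering is an assumption, and the real `util::nybble_get_mismatch` (whose signature is not shown in this diff) may differ on all of these points.

```rust
/// Nybble `i` of `key`, or `None` past the end.
/// Assumption: even indices take the high half of a byte, odd indices the low half.
fn nybble_at(key: &[u8], i: usize) -> Option<u8> {
    let byte = *key.get(i / 2)?;
    Some(if i % 2 == 0 { byte >> 4 } else { byte & 0x0f })
}

/// Index of the first nybble at which the two keys differ, or `None` when
/// the keys are byte-for-byte identical. A key that ends early counts as
/// differing at the first nybble the longer key still has.
fn first_nybble_mismatch(a: &[u8], b: &[u8]) -> Option<usize> {
    let total = 2 * a.len().max(b.len());
    (0..total).find(|&i| nybble_at(a, i) != nybble_at(b, i))
}

fn main() {
    // "ab" = 0x61 0x62 and "ac" = 0x61 0x63 differ in the low half of byte 1.
    println!("{:?}", first_nybble_mismatch(b"ab", b"ac"));
    println!("{:?}", first_nybble_mismatch(b"same", b"same"));
}
```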
143 changes: 143 additions & 0 deletions src/key.rs
@@ -0,0 +1,143 @@
use std::borrow::Borrow;
use std::borrow::Cow;

/// A trait for keys in a QP-trie.
///
/// Implementing types must be borrowable in the form of both a key slice,
/// such as `&str`, and the plain byte slice `&[u8]`. The former is used in
/// the public `trie::Trie` API, while the latter is used internally to match
/// and store keys.
///
/// Note that, as a consequence, keys which are not bytewise-equivalent will
/// not associate to the same entry, even if they are equal under `Eq`.
pub trait AsKey {
/// The borrowed form of this key type.
type Borrowed: ?Sized;

/// View the key slice as a plain byte sequence.
fn nybbles_from(key: &Self::Borrowed) -> &[u8];

/// Borrow the key as nybbles, in the form of a plain byte sequence.
fn as_nybbles(&self) -> &[u8];
}
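
Note on the doc comment above: the warning that keys equal under `Eq` may still land in different entries is easy to miss. The sketch below is a self-contained illustration, not code from this PR. It redeclares a trimmed-down `AsKey` with only `as_nybbles`, and `CaseInsensitive` and `same_trie_key` are hypothetical names introduced for the example.

```rust
/// Trimmed-down local copy of the trait shape from this diff (one method only).
trait AsKey {
    fn as_nybbles(&self) -> &[u8];
}

impl AsKey for String {
    fn as_nybbles(&self) -> &[u8] {
        self.as_bytes()
    }
}

impl AsKey for Vec<u8> {
    fn as_nybbles(&self) -> &[u8] {
        self.as_slice()
    }
}

/// Hypothetical key type whose `Eq` ignores ASCII case.
#[derive(Debug)]
struct CaseInsensitive(String);

impl PartialEq for CaseInsensitive {
    fn eq(&self, other: &Self) -> bool {
        self.0.eq_ignore_ascii_case(&other.0)
    }
}

impl Eq for CaseInsensitive {}

impl AsKey for CaseInsensitive {
    fn as_nybbles(&self) -> &[u8] {
        self.0.as_bytes()
    }
}

/// True when two keys present the same bytes, which is all a byte-oriented
/// trie can compare.
fn same_trie_key<A: AsKey, B: AsKey>(a: &A, b: &B) -> bool {
    a.as_nybbles() == b.as_nybbles()
}

fn main() {
    // A `String` and a `Vec<u8>` with the same bytes are the same key.
    assert!(same_trie_key(&String::from("abc"), &b"abc".to_vec()));

    // Equal under `Eq`, yet bytewise different: these would be two entries.
    let upper = CaseInsensitive("ABC".to_string());
    let lower = CaseInsensitive("abc".to_string());
    assert!(upper == lower);
    assert!(!same_trie_key(&upper, &lower));
}
```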

macro_rules! impl_for_borrowables {
( $type:ty, $life:lifetime; $borrowed:ty; $view:ident ) => {
impl<$life> AsKey for &$life $type {
type Borrowed = $borrowed;

#[inline]
fn as_nybbles(&self) -> &[u8] {
self.$view()
}

#[inline]
fn nybbles_from(key: &Self::Borrowed) -> &[u8] {
key.$view()
}
}
};
( $type:ty; $borrowed:ty; $view:ident ) => {
impl AsKey for $type {
type Borrowed = $borrowed;

#[inline]
fn as_nybbles(&self) -> &[u8] {
self.$view()
}

#[inline]
fn nybbles_from(key: &Self::Borrowed) -> &[u8] {
key.$view()
}
}
}
}

impl_for_borrowables! { [u8], 'a; [u8]; as_ref }
impl_for_borrowables! { Vec<u8>; [u8]; as_ref }
impl_for_borrowables! { Cow<'a, [u8]>, 'a; [u8]; as_ref }

impl_for_borrowables! { str, 'a; str; as_bytes }
impl_for_borrowables! { String; str; as_bytes }
impl_for_borrowables! { Cow<'a, str>, 'a; str; as_bytes }

macro_rules! impl_for_arrays_of_size {
($($length:expr)+) => { $(
impl AsKey for [u8; $length] {
type Borrowed = [u8];

#[inline]
fn as_nybbles(&self) -> &[u8] {
self.as_ref()
}

#[inline]
fn nybbles_from(key: &Self::Borrowed) -> &[u8] {
key
}
}
)+ }
}

impl_for_arrays_of_size! {
0 1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17 18 19
20 21 22 23 24 25 26 27 28 29
30 31 32
}
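
Note on `impl_for_arrays_of_size!`: the macro stops at 32 elements, so a `[u8; 33]` key would not implement `AsKey`. If the crate can require Rust 1.51 or newer (an assumption about its minimum supported version, which this diff does not state), a single const-generic impl might cover every length. The sketch below redeclares the trait locally so that it compiles on its own.

```rust
/// Local copy of the trait from this diff, so the sketch stands alone.
trait AsKey {
    type Borrowed: ?Sized;
    fn nybbles_from(key: &Self::Borrowed) -> &[u8];
    fn as_nybbles(&self) -> &[u8];
}

/// One impl for every array length, replacing the 0..=32 macro expansion.
impl<const N: usize> AsKey for [u8; N] {
    type Borrowed = [u8];

    #[inline]
    fn as_nybbles(&self) -> &[u8] {
        &self[..]
    }

    #[inline]
    fn nybbles_from(key: &Self::Borrowed) -> &[u8] {
        key
    }
}

fn main() {
    let small = [1u8, 2];
    let large = [7u8; 40]; // longer than the macro's upper bound of 32
    assert_eq!(small.as_nybbles(), &[1u8, 2][..]);
    assert_eq!(large.as_nybbles().len(), 40);
}
```

The trade-off is a higher minimum compiler version in exchange for removing the length cap.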

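/// A key that can be split at a byte offset, yielding a prefix of type `K`.
pub trait Break<K: ?Sized>: AsKey {
/// An empty borrowed key.
fn empty<'a>() -> &'a K;
/// The prefix of this key ending at byte offset `loc`. Implementations may
/// round `loc` down to keep the prefix valid, e.g. to a `char` boundary for `str`.
fn find_break(&self, loc: usize) -> &K;
/// The whole key, borrowed as `K`.
fn whole(&self) -> &K;
}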

// All `AsKey`s can break as [u8], by construction of the qp-trie.
impl<'b, K> Break<[u8]> for K
where
K: AsKey,
K::Borrowed: Borrow<[u8]>,
{
#[inline]
fn empty<'a>() -> &'a [u8] {
<&'a [u8]>::default()
}

#[inline]
fn whole(&self) -> &[u8] {
self.as_nybbles()
}

#[inline]
fn find_break(&self, loc: usize) -> &[u8] {
&self.as_nybbles()[..loc]
}
}

impl<'b, K> Break<str> for K
where
K: AsRef<str> + AsKey,
K::Borrowed: Borrow<str>,
{
#[inline]
fn empty<'a>() -> &'a str {
<&'a str>::default()
}

#[inline]
fn whole(&self) -> &str {
self.as_ref()
}

#[inline]
fn find_break(&self, mut loc: usize) -> &str {
let s: &str = self.as_ref();
while !s.is_char_boundary(loc) {
loc -= 1;
}

&s[..loc]
}
}
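
Note on `Break<str>::find_break`: the loop walks `loc` backwards until it reaches a `char` boundary, so a break that falls inside a multi-byte character is rounded down instead of panicking on the slice. The standalone function below mirrors that loop for illustration; `prefix_at_boundary` is a hypothetical name, not part of the PR.

```rust
/// Longest prefix of `s` that ends at or before byte offset `loc` and is
/// still valid UTF-8. Offset 0 is always a boundary, so the loop terminates
/// without underflow, and offsets past the end fall back to `s.len()`.
fn prefix_at_boundary(s: &str, mut loc: usize) -> &str {
    while !s.is_char_boundary(loc) {
        loc -= 1;
    }
    &s[..loc]
}

fn main() {
    let s = "héllo"; // 'é' occupies bytes 1 and 2
    assert_eq!(prefix_at_boundary(s, 1), "h");
    assert_eq!(prefix_at_boundary(s, 2), "h"); // inside 'é', rounded down
    assert_eq!(prefix_at_boundary(s, 3), "hé");
}
```

One consequence worth keeping in mind when reviewing: two string keys that differ only in the trailing bytes of a multi-byte character will report a break before that character, not at the differing byte.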
4 changes: 3 additions & 1 deletion src/lib.rs
@@ -15,6 +15,7 @@ mod serialization;

mod entry;
mod iter;
+mod key;
mod node;
mod sparse;
mod subtrie;
@@ -25,5 +26,6 @@ pub mod wrapper;

pub use entry::{Entry, OccupiedEntry, VacantEntry};
pub use iter::{IntoIter, Iter, IterMut};
+pub use key::{AsKey, Break};
pub use subtrie::SubTrie;
-pub use trie::{Break, Trie};
+pub use trie::Trie;