A relatively simple Datalog engine in Rust

In this post we will build up a relatively concise Datalog engine: DataFrog.

I initially wrote "simple", and I think that is nearly the case, but I'll have to let you be the judge. One of the design goals is that, simple or not, there should be little standing in the way of someone coming to understand how it all works, given enough time. No complicated runtime, or weird abstractions and such. Or, less weird, anyhow.

Acks

This project got put together rather suddenly, in response to some work the Rust folks are doing on their new and improved borrow checker. They had been using differential dataflow, which was super brave of them, but it seemed like a more streamlined Datalog implementation might suit them better for now.

Thanks go out to @lqd, @nikomatsakis, and @qmx for moving things along (and testing, and listening, and explaining how it could be better, etc).

Datalog

Datalog is a recently re-popularized language from decades back, in which you start with a few collections of "tuples", often called "facts" in Datalog, and repeatedly apply a set of rules that derive new tuples from existing tuples. These rules all have a restricted form, where you can insist that some tuple values in different collections be the same, and only when they match do they produce a new tuple.

For example, perhaps we have two collections nodes and edges, where nodes contains usize tuples and edges contains (usize, usize) tuples. We might write a rule

nodes(y) <- nodes(x), edges(x,y)

which says that whenever we have an element of nodes and elements of edges that match in their first coordinate, we can add the second coordinate to nodes.
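
For example, if nodes starts out as {1} and edges contains (1,2) and (2,3), the first application of the rule adds 2 to nodes, a second application adds 3, and after that no further tuples can be derived.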

Relations

Our engine is going to start with a definition of a Relation<Tuple>, which will be a sorted list of distinct elements of type Tuple.

/// A sorted list of distinct tuples.
pub struct Relation<Tuple: Ord> {
    elements: Vec<Tuple>
}

This type is not much different than a Vec<Tuple>, which is just a list of Tuple elements. The main difference is that it will guarantee that its elements are sorted and distinct. We could ensure this by providing a very limited constructor:

impl<Tuple: Ord> From<Vec<Tuple>> for Relation<Tuple> {
    fn from(mut elements: Vec<Tuple>) -> Self {
        elements.sort_unstable();
        elements.dedup();
        Relation { elements }
    }
}

This explains how to create a relation from a Vec<Tuple>, which is essentially the same thing minus the guarantee about ordering and distinctness. As you can see, we perform the sorting and deduplication. Fortunately, if the input is already sorted and deduplicated this is relatively cheap.

In fact we will generalize this a bit, so that we can take in input types other than just Vec<Tuple>. We should be able to take anything that can be enumerated as a list of Tuple elements, which we can express using the IntoIterator trait as a constraint:

impl<Tuple: Ord, I: IntoIterator<Item=Tuple>> From<I> for Relation<Tuple> {
    fn from(iterator: I) -> Self {
        let mut elements: Vec<Tuple> = iterator.into_iter().collect();
        elements.sort_unstable();
        elements.dedup();
        Relation { elements }
    }
}

This implementation will allow us to use not only vectors, but also other iterators that might transform or restrict a set of tuples.

In the interest of completeness, we will have one more method for the Relation type, a method merge that consumes two relations and produces their union:

impl<Tuple: Ord> Relation<Tuple> {
    /// Merges two relations into their union.
    pub fn merge(self, other: Self) -> Self {
        let mut elements = Vec::with_capacity(self.elements.len() + other.elements.len());
        elements.extend(self.elements.into_iter());
        elements.extend(other.elements.into_iter());
        elements.into()
    }
}

The into() method is provided by Rust for any type implementing From (as defined just above). This fairly primitive method just concatenates the two relations and then sorts and deduplicates them (in from(), by way of into()).
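
Here is a quick usage sketch of what we have so far. This is just an illustrative test of mine (it assumes it lives in the same module as Relation, so that the elements field is visible), not part of the engine:

#[test]
fn relation_basics() {
    // Build relations from a vector and from an arbitrary iterator.
    let r1: Relation<(u32, u32)> = vec![(3, 4), (1, 2), (1, 2)].into();
    let r2: Relation<(u32, u32)> = (0..3).map(|x| (x, x + 1)).into();
    // Merge them; the result is again sorted and distinct.
    let merged = r1.merge(r2);
    assert!(merged.elements.windows(2).all(|w| w[0] < w[1]));
}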

Variables

While the Relation type may be great for static sets of tuples, in Datalog we expect our relations to grow. As we apply rules we must add more facts, and then these new facts may lead to even more facts, until we eventually reach some limit (one always reaches a limit in Datalog, by virtue of not being able to create new values for tuple coordinates).

Our framework is going to repeatedly apply a set of rules to our growing relations, each of whose tuples are either "recent", meaning new as of the most recent application of rules, or "stable", meaning they have been present for multiple rounds of rule applications (and the rules have had a chance to react to their presence).

To accommodate growing relations, we introduce a new type Variable<Tuple>, which is similar in spirit to a Relation<Tuple> except that we pay some attention to which Tuple elements are brand new and which are not.

pub struct Variable<Tuple: Ord> {
    /// A list of already processed tuples.
    stable: Vec<Relation<Tuple>>,
    /// A list of recently added but unprocessed tuples.
    recent: Relation<Tuple>,
    /// A list of tuples yet to be introduced.
    to_add: Vec<Relation<Tuple>>,
}

In this type tuples can be found in one of three places.

  1. A new tuple is first added to the self.to_add list, which indicates that we think the tuple should be in the relation that the variable represents. This list is mostly a "to-do" list, and we aren't going to look at its contents other than to add tuples.

  2. Tuples in the to_add list are eventually promoted to the self.recent relation, if it turns out that they are novel (they haven't already been added to the variable). The recent relation contains those tuples that have just been admitted, and should be re-considered by each of the rules that we might apply.

  3. Once they have been considered by all rules, recent tuples are moved into self.stable, which contains the full list of processed tuples. As we will see, some rules will want access to even old tuples, and so we will want to keep them around, even if they are not driving new rule applications.

Our rules will be able to look at a Variable and see which tuples are new and should be reacted to (self.recent), if needed consult tuples that already exist (self.stable), and totally ignore tuples that may be added in the next iteration (self.to_add).

You might be wondering why stable is a list of relations rather than just one, and it has to do with how we will efficiently maintain the contents. We want the flexibility to build stable out of several lists of geometrically varying size, so that when we add a single tuple we need not rebuild the entire list.
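
Later code will also call two small methods on variables: insert, which stages new tuples in to_add, and complete, which extracts the final Relation once iteration is done. Under the simplified field layout above (no shared ownership yet), minimal sketches of these might look like:

impl<Tuple: Ord> Variable<Tuple> {
    /// Stages a batch of tuples to be considered in a future round.
    pub fn insert(&mut self, relation: Relation<Tuple>) {
        self.to_add.push(relation);
    }
    /// Once iteration has finished, merges all stable relations into one.
    pub fn complete(mut self) -> Relation<Tuple> {
        assert!(self.recent.elements.is_empty());
        assert!(self.to_add.is_empty());
        let mut result: Relation<Tuple> = Vec::new().into();
        while let Some(batch) = self.stable.pop() {
            result = result.merge(batch);
        }
        result
    }
}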

Rules

What do rules look like? We had an example up above,

nodes(y) <- nodes(x), edges(x,y)

which looks for matches in nodes and edges and adds them to nodes. We could naively implement this by looking at all of nodes and edges each iteration, both stable and recent, and adding every matching result we find to nodes. That would be a lot of work, and fortunately there are simpler ways to accomplish the same thing.

In each round, we assume that we have already added to the output any tuples that derive from stable tuples in the input variables. This means that we only need to look at matches that involve at least one recent tuple. In the case up above with just two relations, that means we need to look for matches between:

  • nodes.recent and edges.stable, and
  • nodes.stable and edges.recent, and
  • nodes.recent and edges.recent.

If we can come up with an efficient way to find matches between two input Relation types, we can just use that three times with the associated classes of tuple.

Helpers

Let's develop a helpful method that we will use to join together relations, be they recent or stable. We can't join just any two relations, because they need to have some common structure for us to match on. We will insist that their Tuple types have the form (Key, Val1) and (Key, Val2), just by defining an implementation on relations of that type (meaning it is applicable to such pair tuple types, and not otherwise).

We also aren't exactly sure what to do with matches yet, so let's leave that open-ended. Instead, we will ask for a function that we can apply to matching Key, Val1, and Val2 triples, and plan to apply that function to each match rather than populate some output list or anything like that. As we will see, we can still stash the output in a list using the function, and be a bit more flexible at the same time.

Here is a prototype for the method:

fn join_helper<Key: Ord, Val1, Val2>(
    input1: &Relation<(Key,Val1)>,
    input2: &Relation<(Key,Val2)>,
    mut result: impl FnMut(&Key, &Val1, &Val2)) {

    // do some stuff probably.

}

What will we do in such a method? Well, we know that each relation is sorted, so we really just need to do a merge between the two relations. We swing through each of them looking for matching keys, and when the keys don't match we advance the relation with the smaller key.

To march through the relations, we will recast them as "slices", which is what Rust calls a contiguous hunk of valid typed memory. In other languages we might use indices, or just raw pointers, but this approach is one way to convince Rust's type system that all of our data is valid; we will not be dereferencing random garbage memory.

    // represent the relations as slices.
    let mut slice1 = &input1.elements[..];
    let mut slice2 = &input2.elements[..];

    while !slice1.is_empty() && !slice2.is_empty() {

        use std::cmp::Ordering;

        // If the keys match call `result`, else advance the smaller key until they might.
        match slice1[0].0.cmp(&slice2[0].0) {
            Ordering::Less => {
                slice1 = gallop(slice1, |x| x.0 < slice2[0].0);
            },
            Ordering::Equal => {

                // Determine the number of matching keys in each slice.
                let count1 = slice1.iter().take_while(|x| x.0 == slice1[0].0).count();
                let count2 = slice2.iter().take_while(|x| x.0 == slice2[0].0).count();

                // Produce results from the cross-product of matches.
                for index1 in 0 .. count1 {
                    for index2 in 0 .. count2 {
                        result(&slice1[0].0, &slice1[index1].1, &slice2[index2].1);
                    }
                }

                // Advance slices past this key.
                slice1 = &slice1[count1..];
                slice2 = &slice2[count2..];
            }
            Ordering::Greater => {
                slice2 = gallop(slice2, |x| x.0 < slice1[0].0);
            }
        }
    }

I think a lot of this makes sense, except for the part where I call gallop. You probably don't know what that method does, but roughly it slides forward through its first argument until the second argument (a predicate) is no longer true. Instead of doing this one element at a time, it repeatedly doubles its step size until it finds a violation, and then repeatedly halves the step size until it is right up against the limit. For completeness, it looks like this:

pub fn gallop<T>(mut slice: &[T], mut cmp: impl FnMut(&T)->bool) -> &[T] {
    // if empty slice, or already >= element, return
    if slice.len() > 0 && cmp(&slice[0]) {
        let mut step = 1;
        while step < slice.len() && cmp(&slice[step]) {
            slice = &slice[step..];
            step = step << 1;
        }

        step = step >> 1;
        while step > 0 {
            if step < slice.len() && cmp(&slice[step]) {
                slice = &slice[step..];
            }
            step = step >> 1;
        }

        slice = &slice[1..]; // advance one, as we always stayed < value
    }

    return slice;
}
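
Just to pin down gallop's contract, here is a small sanity check of my own (not from the crate): it returns the suffix starting at the first element for which the predicate fails.

#[test]
fn gallop_contract() {
    let data = [1, 2, 4, 4, 7, 9];
    // Skip past everything strictly less than 4.
    assert_eq!(gallop(&data[..], |&x| x < 4), &[4, 4, 7, 9][..]);
    // If the predicate never fails, we run off the end.
    assert!(gallop(&data[..], |&x| x < 100).is_empty());
}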

Rules (redux)

Now that we have a helper in hand, what might a rule look like?

Recall that we had those three cases we needed to process,

  • nodes.recent and edges.stable, and
  • nodes.stable and edges.recent, and
  • nodes.recent and edges.recent.

Let's write a method that takes two input variables and adds new results to an output variable. Rather than require that the output variable contain tuples exactly matching the key and values of the inputs, we will let the user supply a transformation from those three types to whatever tuple type it manages.

Remember that stable contains a list of relations, so we need to swing through each of them.

pub fn join_into<Key: Ord, Val1: Ord, Val2: Ord, Result: Ord>(
    input1: &Variable<(Key, Val1)>,
    input2: &Variable<(Key, Val2)>,
    output: &mut Variable<Result>,
    mut logic: impl FnMut(&Key, &Val1, &Val2)->Result) {

    let mut results = Vec::new();

    // input1.recent and input2.stable.
    for batch2 in input2.stable.iter() {
        join_helper(&input1.recent, &batch2, |k,v1,v2| results.push(logic(k,v1,v2)));
    }

    // input1.stable and input2.recent.
    for batch1 in input1.stable.iter() {
        join_helper(&batch1, &input2.recent, |k,v1,v2| results.push(logic(k,v1,v2)));
    }

    // input1.recent and input2.recent.
    join_helper(&input1.recent, &input2.recent, |k,v1,v2| results.push(logic(k,v1,v2)));

    output.insert(results.into());
}

That's it for the join_into rule! If we apply that over and over and over, we will populate output with all of the results of matching up input1 and input2 on their shared key, allowing output to pick out whichever tuple fields it is most interested in keeping.

Distinctness

There is one last really important thing to do: advance the tuples in each Variable.

Remember that we have to_add, recent, and stable, and we want to migrate recent into stable and then to_add into recent. This is also a great time to check if recent is non-empty, because we will probably want to stop iterating once all of our variables have empty recent relations (because there should be nothing new to do).

Our method to do this is a bit scary, so we will break it down into two parts: first migrate recent into stable, and second migrate to_add into recent. The overall signature of the method we are about to write is:

impl<Tuple: Ord> Variable<Tuple> {
    fn changed(&mut self) -> bool {

        // 1. Merge self.recent into self.stable.
        unimplemented!()

        // 2. Move self.to_add into self.recent.
        unimplemented!()

        // Return true iff recent is non-empty.
        !self.recent.is_empty()
    }

Let's start with the first step: merging self.recent and self.stable. The plan here is that we are maintaining a list of relations, each of which is distinct from the others. We will also make sure that recent is distinct from everything in stable (in just a moment), so all we really need to do is add recent to the stable list.

Except, if we just add to the stable list, we may end up with a long list of small relations, and that is a pain. Instead, we will merge stable relations of comparable size, ensuring that no two are within a factor of two of each other. This ensures we have at most a number of relations logarithmic in the total number of tuples, and that we perform an amortized bounded amount of work for each tuple we add (again, logarithmic in the total number of tuples).

        // 1. Merge self.recent into self.stable.
        if !self.recent.is_empty() {
            let mut recent = ::std::mem::replace(&mut self.recent, Vec::new().into());
            while self.stable.last().map(|x| x.len() <= 2 * recent.len()) == Some(true) {
                let last = self.stable.pop().unwrap();
                recent = recent.merge(last);
            }
            self.stable.push(recent);
        }

This isn't the very best implementation possible, but it works great for now. We end up with an empty self.recent, and a self.stable which has only so many relations in it.

The next step is to turn self.to_add into the new self.recent, where we also want to remove any elements that are already present in our updated self.stable. This removal is what ensures we eventually converge: if we eventually add no new tuples, then our next self.recent will be empty and we can finally rest.

        // 2. Move self.to_add into self.recent.
        if let Some(mut to_add) = self.to_add.pop() {
            // 2a. Merge all newly added relations.
            while let Some(to_add_more) = self.to_add.pop() {
                to_add = to_add.merge(to_add_more);
            }
            // 2b. Restrict `to_add` to tuples not in `self.stable`.
            for batch in self.stable.iter() {
                let mut slice = &batch[..];
                to_add.elements.retain(|x| {
                    slice = join::gallop(slice, |y| y < x);
                    slice.len() == 0 || &slice[0] != x
                });
            }
            self.recent = to_add;
        }

The two steps here are first to coalesce all of the newly added tuples into a Relation, which will be sorted and distinct, and then to swing through each relation in self.stable, retaining new elements only when they cannot be found there. We use that neat gallop thing again.

We've now completed the changed() method, which tells us whether there have been any changes in the variable since we last looked, and which maintains all of the data internally in a reasonable representation.
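
One small thing I glossed over: the code above calls len() and is_empty() on relations and slices them with &batch[..], which the bare struct from earlier does not support. A minimal way to make that work (and roughly what the real crate does, if I recall correctly) is to let a Relation dereference to a slice of its elements:

impl<Tuple: Ord> ::std::ops::Deref for Relation<Tuple> {
    type Target = [Tuple];
    fn deref(&self) -> &[Tuple] {
        &self.elements[..]
    }
}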

That's just about all of the engine!

Caveats

Ok, actually there are a bunch of other things. For various ergonomic reasons, several fields are wrapped in Rc<RefCell<_>> layers, which provide run-time ownership tests. Roughly, we have users create an instance of an Iteration type, from which they create variables. This iteration thing wraps all of the changed() logic together and tries to prevent you from accidentally not stepping your variables along. Unfortunately this shares some ownership around, and means that we need reference counted ref-cells.
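
To give a flavor of what that might look like, here is a rough, hypothetical sketch of such an Iteration type; the names and details are illustrative rather than the crate's actual API.

use std::cell::RefCell;
use std::rc::Rc;

/// The part of a variable's interface that an iteration needs to drive it.
trait VariableTrait {
    fn changed(&mut self) -> bool;
}

pub struct Iteration {
    variables: Vec<Rc<RefCell<dyn VariableTrait>>>,
}

impl Iteration {
    pub fn new() -> Self {
        Iteration { variables: Vec::new() }
    }
    /// Steps every registered variable; reports whether any of them changed.
    pub fn changed(&mut self) -> bool {
        let mut changed = false;
        for variable in self.variables.iter() {
            if variable.borrow_mut().changed() {
                changed = true;
            }
        }
        changed
    }
}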

There are also several places where I lied about the implementations, in that I actually wrote something a bit more complicated in the interest of performance (e.g. Relation::merge()). These do make the code work a bit better, but you should get just fine performance without them (a few more copies in the case of merge()).

You can check out the DataFrog repository to see the real story, but the presented structure is roughly accurate, I think.

An example

Let's write an example!

This example is our nodes and edges computation from up above, but actually applied to a real problem. At least, there was a paper published in ASPLOS 2017 that used this as one of their benchmark computations. We are going to reproduce their results!

extern crate datafrog;
use datafrog::Iteration;

fn main() {

    let timer = ::std::time::Instant::now();

    // ELIDED: load `nodes` and `edges` here.

    println!("{:?}\tData loaded", timer.elapsed());

    // Create a new iteration context, ...
    let mut iteration = Iteration::new();

    // .. some variables, ..
    let variable1 = iteration.variable::<(u32,u32)>("nodes");
    let variable2 = iteration.variable::<(u32,u32)>("edges");

    // .. load them with some initial values, ..
    variable1.insert(nodes.into());
    variable2.insert(edges.into());

    // .. and then start iterating rules!
    while iteration.changed() {
        // N(c,a) <-  N(b,a), E(b,c)
        variable1.from_join(&variable1, &variable2, |_b, &a, &c| (c,a));
    }

    let reachable = variable1.complete();
    println!("{:?}\tComputation complete (nodes_final: {})", timer.elapsed(), reachable.len());
}

Woo! That is a pretty cool program. I have some code up at the top that I've elided that reads the initial collections in from disk. We still time all of that, even though most of it is silliness like parsing text.

It's worth pointing out that the rule as described is a bit weird. The variable1 variable actually tracks the transpose of the N relation, which is why a join between variable1 and variable2 finds a common value b rather than matching up a and b, which .. isn't what we want. It occasionally takes some head-scratching to make sure you have the right query.

So, does it perform well?

Echidnatron% cargo run --release --bin graspan1 -- ~/Desktop/graspan/httpd_df.dms
   Compiling datafrog v0.1.0 (file:///Users/mcsherry/Projects/datafrog)
    Finished release [optimized] target(s) in 1.43 secs
     Running `target/release/graspan1 /Users/mcsherry/Desktop/graspan/httpd_df.dms`
Duration { secs: 1, nanos: 564660489 }  Data loaded
Duration { secs: 4, nanos: 521095814 }  Computation complete (nodes_final: 9393283)
Echidnatron%

About 1.5s to load the data, and then three more seconds to run the computation. Maybe add another 1.5s if you want to count the program compilation (seems legit, in some circumstances). How does that compare to the ASPLOS paper? Pretty well, it turns out. Well enough that I'm still blocked on them confirming that we've actually computed the same thing, before getting too sassy.

Next steps

Ergonomics seem to be a bit of a pain here. If you want to have more complicated rules, you need to define a variable for each binary join that you do, which results in a wall of definitions relatively far removed from their use. That is annoying. On the other hand, I'm hoping that a lot of this ends up being automatable from simpler Datalog syntax.

I would expect the ergonomics might improve in the future, as some folks do seem interested in using this in anger. The more eyeballs we get on it, and elbows applied to it, the better I see it getting.

If nothing else, I hope it is an informative and educational tool about how you too could totally build a Datalog engine. If there are any parts you still aren't confident about, give a holler and I can try to clarify those aspects.

Addendum 2018-05-21: Treefrog Leapjoin

The Rust folks have a benchmark query they are interested in, related to a more sophisticated set of rules for their borrow checker (the part of Rust that does the static analysis on your program to see if it can rule out racy memory references). It is a lot more complicated than

nodes(y) <- nodes(x), edges(x,y)

If you take the naive form of their analysis, the rules look like

subset(R1, R2, P) <- outlives(R1, R2, P).
subset(R1, R3, P) <- subset(R1, R2, P), subset(R2, R3, P).
subset(R1, R2, Q) <- subset(R1, R2, P), cfg_edge(P, Q), region_live_at(R1, Q), region_live_at(R2, Q).

requires(R, B, P) <- borrow_region(R, B, P).
requires(R2, B, P) <- requires(R1, B, P), subset(R1, R2, P).
requires(R, B, Q) <- requires(R, B, P), !killed(B, P), cfg_edge(P, Q), region_live_at(R, Q).

borrow_live_at(B, P) <- requires(R, B, P), region_live_at(R, P)

The current bestest form of the analysis does not use these simple rules, but we will be able to do something instructive with them nonetheless. Let's focus on the rule

subset(R1, R2, Q) <- subset(R1, R2, P), cfg_edge(P, Q), region_live_at(R1, Q), region_live_at(R2, Q).

This rule happens to extend the subset relation with a new value Q by joining with three other relations. All of that work and we just get one additional Q out of it (and we discard the P). Let's see how it is written using the relatively simple framework up above:

    // subset_1 and subset_2 are temporary relations; subset_p is the real deal.
    subset_1.from_join(&subset_p, &cfg, |&_p, &(r1, r2), &q| ((r1, q), r2));
    subset_2.from_join(&subset_1, &rla, |&(r1, q), &r2, &()| ((r2, q), r1));
    subset_p.from_join(&subset_2, &rla, |&(r2, q), &r1, &()| (q, (r1, r2)));

This isn't the whole story for the computation, but the fragment shows us that we start with a subset_p relation that we join with a cfg, then twice with rla. The subset_p relation is subset indexed by P, so that a join with cfg adds in a Q field; we then join twice with the (R,Q) tuples in rla to restrict to tuples where both (R1,Q) and (R2,Q) are in rla.

For each (r1, r2, p) we are doing a fair bit of work just to get out the new q values. The cfg relation proposes them all, then we write them all down and re-order them so that rla can test each of them, twice. Is there perhaps a better way?

The Treefrog Awakens

There is a neat algorithm called "leapfrog triejoin", the intellectual property of LogicBlox, Inc. The algorithm we are going to develop, "treefrog leapjoin", may seem similar, but it is a purely superficial connection. They are both connected to the study of worst-case optimal joins, and I think TFLJ is something of a dual to LFTJ.

I should also stress that while the name itself is novel, TFLJ the algorithm is not; it's totally just out of that WCOJ paper up there.

Here is the rough idea:

Imagine that someone provides us with a tuple (r1, r2, p) and asks for the list of q that should extend it. We have a few relations (cfg, rla, and rla again) that have opinions on which q should extend it, so let's maybe ask them.

Imagine each relation had a method

    fn propose(&self, tuple: &Tuple) -> Vec<Q>;

that would tell us what values it would propose for any instance of Tuple. We could take these lists and intersect them, and then list out the matching q values all in one fell swoop. No intermediate subset_1 and subset_2 relations, no sorting them by different keys or anything like that.
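
Just to make "intersect them" concrete, here is a tiny helper of my own (not part of the engine) that intersects two sorted, distinct lists of proposals:

    fn intersect_sorted<T: Ord + Copy>(list1: &[T], list2: &[T]) -> Vec<T> {
        use std::cmp::Ordering;
        let (mut index1, mut index2) = (0, 0);
        let mut out = Vec::new();
        while index1 < list1.len() && index2 < list2.len() {
            match list1[index1].cmp(&list2[index2]) {
                Ordering::Less => index1 += 1,
                Ordering::Greater => index2 += 1,
                Ordering::Equal => {
                    // Present in both lists: keep it and advance both.
                    out.push(list1[index1]);
                    index1 += 1;
                    index2 += 1;
                }
            }
        }
        out
    }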

It is pretty easy to write a propose method like the above, because for any tuple: &Tuple each relation just needs to know how to pick out the fields of tuple that are bound to non-Q variables in the relation. For example, cfg needs to be smart enough to pick out the P and then use it to look up Q values. Each of the uses of rla needs to pick out either R1 or R2 (based on which use it is) and use it to look up Q values.

The relation itself will not be able to do this, which we can see from the two uses of rla which must do different things. Instead, we can wrap the relation in some type that will provide the logic we need perhaps just by associating a Fn(&Tuple)->Key function which extracts the relation's key from any supplied tuple.

    pub struct ExtendWith<'a, Key: Ord+'a, Val: Ord+'a, Tuple: Ord, Func: Fn(&Tuple)->Key> {
        relation: &'a Relation<(Key, Val)>,
        key_func: Func,
        phantom: ::std::marker::PhantomData<Tuple>,
    }

If you are still pondering Rust and all of its idiosyncrasies, there might be a bit to digest here. This type has several parameters, including the mysteriously ominous 'a parameter. You can approximately view this struct as pairing a reference to a Relation<(Key, Val)> with a specific function Fn(&Tuple)->Key. The 'a parameter is Rust's way of acknowledging that while this reference may not be around forever, when we talk about other references to things perhaps we can say that they will be around for roughly as long.

Here is a propose() method for ExtendWith (where we apparently populate a Vec<&'a Val> instead of returning one).

    fn propose(&mut self, prefix: &Tuple, values: &mut Vec<&'a Val>) {
        let key = (self.key_func)(prefix);
        // Find the range of tuples whose key equals `key`, and push their values.
        let start = binary_search(&self.relation[..], |x| &x.0 < &key);
        let slice = &self.relation[start ..];
        let end = slice.len() - gallop(slice, |x| &x.0 <= &key).len();
        values.extend(slice[.. end].iter().map(|&(_, ref val)| val));
    }

This would probably work pretty well, using propose for each relation wrapped with an ExtendWith, but we are going to do something a bit smarter and more flexible.

Enter the Treefrog

The propose method is a solid way to start out eliciting information from each of the participating relations, but we can do a bit better. Imagine that each participating relation provided three methods:

    /// Methods to support treefrog leapjoin.
    pub trait Leaper<'a, Tuple, Val> {
        /// Estimates the number of proposed values.
        fn count(&mut self, prefix: &Tuple) -> usize;
        /// Populates `values` with proposed values.
        fn propose(&mut self, prefix: &Tuple, values: &mut Vec<&'a Val>);
        /// Restricts `values` to proposed values.
        fn intersect(&mut self, prefix: &Tuple, values: &mut Vec<&'a Val>);
    }

The propose method is familiar, if slightly different, but what do the other two methods do? The count method estimates the number of values that would be proposed, and just needs to avoid being zero if the relation would actually propose some values. The intersect method takes a list of proposed values and restricts them down to those that the relation would propose; this can be a lot more efficient than actually proposing the values themselves (imagine a short list of candidate values and a relation that would propose a great many of its own).

Using just this trait, let's write a leapjoin_into method that extends any Variable<Tuple> using a list of these Leaper implementors:

/// Performs treefrog leapjoin using a list of leapers.
pub fn leapjoin_into<'a, Tuple: Ord, Val: Ord+'a, Result: Ord>(
    source: &Variable<Tuple>,
    leapers: &mut [&mut Leaper<'a, Tuple, Val>],
    output: &Variable<Result>,
    mut logic: impl FnMut(&Tuple, &Val)->Result) {

    let mut result = Vec::new();    // temp output storage.
    let mut values = Vec::new();    // temp value storage.

    for tuple in source.recent.borrow().iter() {

        // 1. Determine which leaper would propose the fewest values.
        // 2. Have the least-proposing leaper propose their values.
        // 3. Have the other leapers restrict the proposals.
        // 4. Call `logic` on each value, push into `result`.

    }

    output.insert(result.into());
}

These four steps are not so complicated, I just wanted to avoid a wall of text. Let's look at each of them individually:

        // 1. Determine which leaper would propose the fewest values.
        let mut min_index = usize::max_value();
        let mut min_count = usize::max_value();
        for index in 0 .. leapers.len() {
            let count = leapers[index].count(tuple);
            if min_count > count {
                min_count = count;
                min_index = index;
            }
        }

Great! Super easy. We just ask each leaper, and they tell us. The rest is pretty easy too:

        if min_count > 0 {

            // 2. Have the least-proposing leaper propose their values.
            leapers[min_index].propose(tuple, &mut values);

            // 3. Have the other leapers restrict the proposals.
            for index in 0 .. leapers.len() {
                if index != min_index {
                    leapers[index].intersect(tuple, &mut values);
                }
            }

            // 4. Call `logic` on each value, push into `result`.
            for val in values.drain(..) {
                result.push(logic(tuple, val));
            }
        }

That is the whole leapjoin_into implementation!

Other flavors of TreeFrog

While one of the reasons for the three methods (count, propose, and intersect) is efficiency, another is flexibility. To see what this is about, let's consider a query fragment from Rust's "more optimized" version of their borrow checker logic.

dead_region_requires(R, B, P, Q) :-
    requires(R, B, P),
    !killed(B, P),
    cfg_edge(P, Q),
    !region_live_at(R, Q).

This rule starts with triples (r, b, p) from the requires variable, and applies various restrictions to introduce a new q.

However, these constraints aren't just a matter of proposing values for q. The !killed(B, P) constraint says that no q is appropriate whenever (b, p) is in killed. The !region_live_at(R, Q) constraint says that only those q for which (r, q) is not present in rla are appropriate.

These two negative constraints would be very hard to write in terms of propose. We can't list out the values that would be satisfactory, because there could be an unbounded number of them. When (b, p) is not in killed, literally any value of q is just fine. The rla relation would have to list the values not associated with r, which, while not every possible q, are still most of them.

Fortunately, even for these two classes of constraint we can implement the Leaper trait, with methods

    fn count(&mut self, prefix: &Tuple) -> usize;
    fn propose(&mut self, prefix: &Tuple, values: &mut Vec<&'a Val>);
    fn intersect(&mut self, prefix: &Tuple, values: &mut Vec<&'a Val>);

A bit of pondering, and you may conclude that you can implement each of these methods for the two types of constraints above. Where we called our first Leaper type ExtendWith, I'm going to call these two FilterAnti and ExtendAnti for the killed and rla uses respectively.

The FilterAnti implementation should report a count of either 0 or usize::max_value(), depending on whether the tuple's key and value are present in the relation or not. It is welcome to panic in its propose implementation (something has gone wrong if we are there), and its intersect implementation can be empty (we could only be there if we reported a count of usize::max_value()).

The ExtendAnti implementation should report a count of usize::max_value(), panic in propose(), and perform roughly the same logic for intersect as does ExtendWith, but complementing its decisions (retaining elements when they are absent from the relation).

Notice that it is important that leapjoin_into() does nothing when a min_count of zero comes back. That is a lot cheaper than asking the FilterAnti and ExtendAnti wrappers to double check that the tuple is absent and propose nothing.
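
To make this concrete, here is a rough sketch of how a FilterAnti wrapper might implement the trait, in the style of ExtendWith above. This is illustrative code of mine (it assumes it sits next to Relation, so that the elements field is visible), not the crate's actual implementation.

    pub struct FilterAnti<'a, Key: Ord+'a, Val: Ord+'a, Tuple: Ord, Func: Fn(&Tuple)->(Key, Val)> {
        relation: &'a Relation<(Key, Val)>,
        key_func: Func,
        phantom: ::std::marker::PhantomData<Tuple>,
    }

    impl<'a, Key: Ord+'a, Val: Ord+'a, Val2: 'a, Tuple: Ord, Func: Fn(&Tuple)->(Key, Val)>
    Leaper<'a, Tuple, Val2> for FilterAnti<'a, Key, Val, Tuple, Func> {
        fn count(&mut self, prefix: &Tuple) -> usize {
            // If the (key, value) pair is present we rule out every proposal;
            // if it is absent we impose no restriction at all.
            let key_val = (self.key_func)(prefix);
            if self.relation.elements.binary_search(&key_val).is_ok() {
                0
            } else {
                usize::max_value()
            }
        }
        fn propose(&mut self, _prefix: &Tuple, _values: &mut Vec<&'a Val2>) {
            panic!("FilterAnti should never be asked to propose values");
        }
        fn intersect(&mut self, _prefix: &Tuple, _values: &mut Vec<&'a Val2>) {
            // Nothing to restrict: we only get here when `count` reported
            // `usize::max_value()`, i.e. the tuple was absent from the relation.
        }
    }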

Consequences

Let's check out the performance implications on Rust's prototype borrow checker.

We can grab and run the current version (which uses DataFrog, but not the treefrog leapjoin):

Echidnatron% git clone https://github.com/rust-lang-nursery/polonius/
Echidnatron% cd polonius
Echidnatron% cargo +nightly run --release -- -a DatafrogOpt inputs/clap-rs/app-parser-{{impl}}-add_defaults/ --skip-tuples
...
Time: 7.692s
Echidnatron%

Before getting too excited, we should update Cargo.toml to point at the current datafrog master, which has some recent improvements.

Echidnatron% nano Cargo.toml
Echidnatron% cargo +nightly run --release -- -a DatafrogOpt inputs/clap-rs/app-parser-{{impl}}-add_defaults/ --skip-tuples
...
Time: 6.985s
Echidnatron%

Now let's do a bit of editing in datafrog_opt.rs, which is where their logic lives. I went and did this by hand, and it's not pretty (but arguably not less pretty than before). You end up with a bunch of code that looks like

    // dead_region_requires((R, P, Q), B) :-
    //   requires(R, B, P),
    //   !killed(B, P),
    //   cfg_edge(P, Q),
    //   !region_live_at(R, Q).
    leapjoin_into(
        &requires_rp,           // dynamic source
        &mut [
            &mut killed.filter_anti(|&((_r,p),b)| (b,p)),
            &mut cfg_edge_rel.extend_with(|&((_r,p),_b)| p),
            &mut region_live_at_rel.extend_anti(|&((r,_p),_b)| r),
        ],
        &dead_region_requires,  // destination
        |&((r,p),b),&q| ((r,p,q),b)
    );

instead of

    // dead_region_requires((R, P, Q), B) :-
    //   requires(R, B, P),
    //   !killed(B, P),
    //   cfg_edge(P, Q),
    //   !region_live_at(R, Q).
    dead_region_requires_1.from_antijoin(&requires_bp, &killed, |&(b, p), &r| (p, (b, r)));
    dead_region_requires_2.from_join(
        &dead_region_requires_1,
        &cfg_edge_p,
        |&p, &(b, r), &q| ((r, q), (b, p)),
    );
    dead_region_requires.from_antijoin(
        &dead_region_requires_2,
        &region_live_at_rel,
        |&(r, q), &(b, p)| ((r, p, q), b),
    );

In any case, you also end up improving the time from 6.985 seconds to

Echidnatron% nano src/output/datafrog_opt.rs
Echidnatron% cargo +nightly run --release -- -a DatafrogOpt inputs/clap-rs/app-parser-{{impl}}-add_defaults/ --skip-tuples
...
Time: 5.275s
Echidnatron%

This isn't the world's greatest speed-up, but it is pretty solid. There are also several further avenues to explore, including better ways to perform intersection (right now we always use gallop) and replacing Vec<&'a Val> lists of pointers with just Vec<Val> because all of our Val types are just integers.

For our purposes, we have ideally learned a bit more about how to implement clever join strategies, while additionally supporting various flavors of negative constraints (though only in static relations, rather than variables).