Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Set.to_array use Enum instead of Bigarray #724

Merged

Conversation

thizanne
Copy link
Member

@thizanne thizanne commented Feb 15, 2017

This breaks the dependency Set -> Bigarray -> Int. A quick benchmark
also shows that the new function has comparable performances and can avoid
Out_of_memory.

This is required for having Int.Set as an instantion of Set, see #712.

# Benchmark results (see `conversion`, smaller is better)

$ ./bench.native old 1000
old: creation 0.000853, conversion 0.000130
$ ./bench.native new 1000
new: creation 0.000819, conversion 0.000198

$ ./bench.native old 10000
old: creation 0.008550, conversion 0.001304
$ ./bench.native new 10000
new: creation 0.004781, conversion 0.000600

$ ./bench.native old 1000000
old: creation 0.280667, conversion 0.041365
$ ./bench.native new 1000000
new: creation 0.279677, conversion 0.050910

$ ./bench.native old 10000000
old: creation 2.750616, conversion 0.562951
$ ./bench.native new 10000000
new: creation 2.752049, conversion 0.501288

$ ./bench.native old 100000000
Fatal error: exception Out of memory
$ ./bench.native new 100000000
new: creation 30.198271, conversion 5.130372
(* Benchmark code *)
module S = BatSet.Make (BatInt)

let old_to_array s =
  let acc = BatDynArray.create () in
  S.iter (BatDynArray.add acc) s;
  BatDynArray.to_array acc

let to_array = match Sys.argv.(1) with
  | "old" -> old_to_array
  | "new" -> S.to_array
  | _ -> assert false

let nb_elem = int_of_string (Sys.argv.(2))

let t0 = Unix.gettimeofday ()

let set =
  S.of_enum @@ BatEnum.range 0 ~until:nb_elem

let t1 = Unix.gettimeofday ()

let arr = to_array set

let t2 = Unix.gettimeofday ()

let () =
  Printf.printf "%s: creation %f, conversion %f\n"
    Sys.argv.(1)
    (t1 -. t0)
    (t2 -. t1)

let () =
  if Array.length arr <= 50 then begin
    print_endline "debug:";
    BatArray.print BatInt.print BatIO.stdout arr
  end

@gasche
Copy link
Member

gasche commented Feb 15, 2017

You meant DynArray rather than BigArray.

Could you please compare with a third implementation that would (1) compute the size of the set, (2) allocate an array of the right size and (3) iterate on the set again to define elements.

@thizanne
Copy link
Member Author

I considered this at first, but I do not know how to allocate an uninitialised array, and I thought using Array.make and then S.iter had no way to be faster than enum. Is there such a way or should I go with Array.make ?

@gasche
Copy link
Member

gasche commented Feb 15, 2017

Well you can ask the set for an element, with choose or with direct pattern-matching, to initialize the array.

I think on the contrary that S.iter has chances to be faster than enum, because it is in control of the computation (while Enum is written in a style that gives control to the caller, which is more convenient but a bit more difficult). Also Enum is a complex module to depend on, so it is cleaner if we can avoid it.

This breaks the dependency Set -> Dynarray -> Int. A quick benchmark
also shows that the new function is approximately twice as fast and
can avoid Out_of_memory.
@thizanne
Copy link
Member Author

It is approximately twice as fast indeed. I updated and rebased.

| Node (_, e, _, _) ->
let arr = Array.make (cardinal s) e in
let i = ref 0 in
iter (fun x -> Array.unsafe_set arr (!i) x; incr i) s;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this is the recommended style to write (!i) instead of !i here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong preference, do as you think is best.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep it this way then, unless someone else objects. Are you ok to merge ?

@gasche
Copy link
Member

gasche commented Feb 15, 2017

You should add a Changes entry, otherwise this seems ready to go.

One thing though, it would be nice if we had a couple tests for this function.

@gasche gasche merged commit 0550656 into ocaml-batteries-team:master Feb 15, 2017
@thizanne thizanne mentioned this pull request Feb 15, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants