Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TypedTables.Table as a sink #73

Closed
non-Jedi opened this issue Nov 1, 2019 · 3 comments · Fixed by #75
Closed

TypedTables.Table as a sink #73

non-Jedi opened this issue Nov 1, 2019 · 3 comments · Fixed by #75
Labels
enhancement New feature or request

Comments

@non-Jedi
Copy link

non-Jedi commented Nov 1, 2019

I was playing around with using Transducers with TypedTables.jl. It works relatively well out of the box:

julia> t = Table(a=rand(1:100, 10), b=rand(10))
Table with 2 columns and 10 rows:
      a   b
    ┌──────────────
 168  0.0588238
 284  0.316647
 335  0.619855
 412  0.583985
 523  0.155771
 692  0.44839
 713  0.547894
 857  0.283057
 981  0.361922
 1056  0.368723

julia> f = Filter(x -> iseven(x.a)) |> Map(x -> (a=x.a, b=2x.b))
Filter(Main.λ❓) |>
    Map(Main.λ❓)

julia> t2 = Table(collect(f, t))
Table with 2 columns and 5 rows:
     a   b
   ┌─────────────
 168  0.117648
 284  0.633293
 312  1.16797
 492  0.896779
 556  0.737446

The one problem I see here is calling collect before the Table constructor causes unnecessary copying (creates a Vector{NamedTuple{(:a, :b),Tuple{Int64,Float64}}} which then has to be converted to a NamedTuple{(:a, :b),Tuple{Vector{Int64},Vector{Float64}}}). Using collect only works because Tables.jl defines a Vector{T} where {T<:NamedTuple} as a table.

Any ideas for making Transducers more general in this way to avoid unnecessary copying?

@non-Jedi
Copy link
Author

non-Jedi commented Nov 1, 2019

I'm a dummy. copy! can be used directly for this. I thought Tables were meant to be wholly immutable, but they define both push! and empty!.

julia> t2 = copy!(f, similar(t, 0), t)
Table with 2 columns and 4 rows:
     a   b
   ┌─────────────
 112  0.899638
 296  1.48997
 384  1.96983
 482  0.385463

@non-Jedi non-Jedi closed this as completed Nov 1, 2019
@non-Jedi
Copy link
Author

non-Jedi commented Nov 1, 2019

Would it be worthwhile to define Base.copy(f::Transducer, x) = Base.copy!(f, similar(x, 0), x) to make this kind of thing more ergonomic?

@non-Jedi non-Jedi reopened this Nov 1, 2019
@tkf
Copy link
Member

tkf commented Nov 1, 2019

copy(f::Transducer, x) sounds like a good idea. Another possible name is map. But whether or not it is the right thing to depend on what Base decides to do with map(f, dict): JuliaLang/julia#5794

BTW, I think copy!(f, similar(t, 0), t) would fail if you change row type in f (e.g., add a column). Ideally, something like

foldl(push!!, xf, input_table; init=Table())

should work. But Table() throws and push!! is not defined for Table.

@tkf tkf added the enhancement New feature or request label Nov 1, 2019
@tkf tkf changed the title Unnecessary copying with TypedTables.jl TypedTables.Table as a sink Nov 5, 2019
@tkf tkf closed this as completed in #75 Nov 5, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants