v0.0.0.9004

Pre-release
@naren-srinivasan naren-srinivasan released this 14 Dec 07:37
· 42 commits to master since this release

Key features added in this release

Python functions in analysis pipelines

Python functions can now be defined, and Python files sourced, through the reticulate package. These functions, which have a reference in the R environment, can be registered to an AnalysisPipeline object. This means that an interoperable pipeline can be created comprising R, Spark, and Python functions.

A vignette on how to use Python functions has been added. The Interoperable pipelines vignette has also been updated to showcase pipelines with all 3 engines.
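
As a rough illustration, a Python file that could be sourced this way might look like the sketch below. The file name `stats_utils.py` and the function `column_range` are hypothetical and not part of the package. The function is duck-typed so that it accepts anything indexable by column name, such as a dict of lists or a Pandas DataFrame.

```python
# stats_utils.py (hypothetical file name, for illustration only)


def column_range(data, column):
    """Return (min, max) of the non-missing values in `column`.

    `data` is assumed to be indexable by column name, for example a
    dict of lists or a Pandas DataFrame.
    """
    values = [v for v in data[column] if v is not None]
    if not values:
        raise ValueError(f"no non-missing values in column {column!r}")
    return min(values), max(values)
```

Sourcing such a file with reticulate's `source_python()` would make `column_range` available in the R environment, after which it could be registered as described in the vignette.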

Improved formula parsing & execution

Some logical inconsistencies in formula semantics have been resolved. Now, for data functions, the data argument can be one of three things:

  • Not passed, meaning that it acts as a pronoun referring to the input that the pipeline object was instantiated with
  • A data frame that is passed explicitly
  • A formula, which denotes the output of a previous function
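
The three cases above can be pictured with a small conceptual sketch. This is not the package's implementation: `OutputRef` is a hypothetical stand-in for a formula, and `resolve_data`, `pipeline_input`, and `previous_outputs` are names invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputRef:
    """Hypothetical stand-in for a formula: points at an earlier step's output."""
    step_id: str


def resolve_data(data_arg, pipeline_input, previous_outputs):
    """Pick the data a data function should operate on.

    - None      -> the input the pipeline was instantiated with
    - OutputRef -> the stored output of the referenced earlier step
    - otherwise -> the explicitly passed data, unchanged
    """
    if data_arg is None:
        return pipeline_input
    if isinstance(data_arg, OutputRef):
        if data_arg.step_id not in previous_outputs:
            raise KeyError(f"no output recorded for step {data_arg.step_id!r}")
        return previous_outputs[data_arg.step_id]
    return data_arg


# Usage with toy data: one pipeline input and one stored step output.
pipeline_input = {"x": [1, 2, 3]}
previous_outputs = {"f1": {"x": [10, 20]}}

default_case = resolve_data(None, pipeline_input, previous_outputs)
explicit_case = resolve_data({"x": [7]}, pipeline_input, previous_outputs)
formula_case = resolve_data(OutputRef("f1"), pipeline_input, previous_outputs)
```

Checking the missing case first keeps the default behaviour explicit, and an unknown step reference fails loudly rather than silently falling back to the pipeline input.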

Additionally, if the data argument of a data function is an R data frame, a Spark DataFrame, or a Pandas DataFrame, it is automatically converted to the type required by the engine of that data function.
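
The conversion rule can be sketched as a simple lookup. This is a conceptual illustration only: the engine keys (`"r"`, `"spark"`, `"python"`) and the helper `plan_conversion` are assumptions made for this example, and the actual conversion is handled inside the package.

```python
# Assumed mapping from engine name to the data type that engine works on.
ENGINE_DATA_TYPE = {
    "r": "R data frame",
    "spark": "Spark DataFrame",
    "python": "Pandas DataFrame",
}


def plan_conversion(input_type, engine):
    """Return the target type if a conversion is needed, else None."""
    if engine not in ENGINE_DATA_TYPE:
        raise ValueError(f"unknown engine: {engine!r}")
    if input_type not in ENGINE_DATA_TYPE.values():
        raise ValueError(f"unsupported input type: {input_type!r}")
    target = ENGINE_DATA_TYPE[engine]
    return None if input_type == target else target


# An R data frame passed to a Python data function needs converting;
# a Spark DataFrame passed to a Spark data function does not.
r_to_python = plan_conversion("R data frame", "python")
spark_to_spark = plan_conversion("Spark DataFrame", "spark")
```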