Skip to content

Latest commit

 

History

History
26 lines (17 loc) · 1.2 KB

README.md

File metadata and controls

26 lines (17 loc) · 1.2 KB

RFE-cross-validation

RFE (Recursive Feature Elimination) with cross validation

RFECV from scikit-learn optimizes the number of best features [1] but is usual the desire to know the selected variables for other cases. The RFE function [2] addresses this issue but is it not cross-validated.

A cross-validation for feature importance can give the idea of how a given feature is well established in the feature importance rank. The rank variability is shown by the standard deviation of the rank mean. As the next figure shows:

image

Figure: Mean rank across folds for each feature.

TO-DO:

  • Separate module file and example script;
  • Prettier output or returning of results;
  • Input parameters for figures (done for figsize);
  • Get rid of commentaries in portuguese (on the way);
  • Return of figures instead of plotting;
  • Better check of inputs (types);
  • Consider the variability of the std deviation of the mean;
  • Richer example.

[1] http://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.RFECV.html

[2] https://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.RFE.html