Feature Request: Class Weighting capabilities #229

Open
irslushy opened this issue Nov 29, 2023 · 1 comment

Comments

@irslushy

I'm working on a Random Forest classification model for a dataset with an unequal class balance. Other packages such as scikit-learn provide a "class weight" option (docs), which allows the minority class(es) to be weighted more heavily when training the individual trees. As far as I can tell, no Julia decision tree implementation offers anything like that. Would this be possible to add?
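For concreteness, here is a rough sketch of the kind of weighting meant here. It follows the "balanced" heuristic that scikit-learn documents for class_weight, n_samples / (n_classes * count), using only the Python standard library. The function name `balanced_class_weights` and the toy labels are made up for illustration and are not part of any existing package.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a dict mapping each class to n_samples / (n_classes * class_count).

    Hypothetical helper for illustration; rarer classes get larger weights.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Toy imbalanced dataset: 8 observations of class "a", 2 of class "b".
labels = ["a"] * 8 + ["b"] * 2
weights = balanced_class_weights(labels)
print(weights)
```

With these weights, each class contributes the same total weight to training (8 × 0.625 = 2 × 2.5), which is the effect being asked for.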

@ablaom
Member

ablaom commented Nov 29, 2023

Yes, this would be nice to have.

The scikit-learn model does have class_weight, and this is exposed in the MLJ wrapper. Unfortunately, passing a Julia dict does not appear to work. Watch the linked issue for a possible workaround.

I haven't looked at the ScikitLearn.jl wrapper.

DecisionTree.jl gets only light maintenance from a few volunteers. If you'd like this feature added, your best chance is to make a PR, assuming you have the expertise. I'd be happy to review if no one else has the time.

It would be worth looking at the Python code, because it is based on C code which I think was ported to DecisionTree.jl, but I don't recall any accommodation for weights there. Or maybe the C code was just for individual trees. Sorry, I don't remember just now.

My suggestion would be to support per-observation weights first, and build class weight support on top of that (by using an analogue of the tool you linked to, which is something like this).
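To illustrate the layering, here is a minimal sketch in plain Python (not DecisionTree.jl code, and not how any existing library implements it). It assumes the split criterion is Gini impurity computed from summed observation weights rather than raw counts; class weights are then just a lookup that produces one weight per observation. The names `sample_weights_from_class_weights` and `weighted_gini` are hypothetical.

```python
from collections import defaultdict


def sample_weights_from_class_weights(labels, class_weight):
    """Expand per-class weights into one weight per observation (hypothetical helper)."""
    return [class_weight[y] for y in labels]


def weighted_gini(labels, sample_weight):
    """Gini impurity where each observation counts by its weight, not by 1."""
    totals = defaultdict(float)
    for y, w in zip(labels, sample_weight):
        totals[y] += w
    total = sum(totals.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((t / total) ** 2 for t in totals.values())


# Toy node: 8 observations of class "a", 2 of class "b".
labels = ["a"] * 8 + ["b"] * 2
class_weight = {"a": 0.625, "b": 2.5}  # assumed weights, chosen to balance the classes

sample_weight = sample_weights_from_class_weights(labels, class_weight)
unweighted = weighted_gini(labels, [1.0] * len(labels))
weighted = weighted_gini(labels, sample_weight)
print(unweighted, weighted)
```

The point of the design is that the tree-building code only ever needs to know about per-observation weights; class weighting stays a thin preprocessing step on top.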
