Picking up on the idea of the “advent calendar” that is very typical in Germany (at least in Bavaria), every day this month I’ll be sharing one link/piece of info I’ve read recently in the field of AI/ML that I found interesting.
For a start, my suggestion of the day is FLAML (https://github.com/microsoft/FLAML). This is an AutoML Python library available on GitHub, created by people at MSR and based on research published at the end of last year. Why another one, especially considering we already have AutoML on Azure Machine Learning? A few reasons I like it:
- It’s fully Python based with a super-simple API. You control all the parameters of your experiments in code that you can put under source control.
- In my experiments (tabular data, classification/regression), I got consistently good results.
- You can set a training budget, i.e., how long you want it to train.
- You can pick the algorithms to use in the training (the most common being LightGBM, XGBoost and CatBoost), and if you pick only one, you’re effectively doing hyperparameter tuning.
- It supports sklearn pipelines (e.g., you can run AutoML as the final step of a training pipeline).
- You can run an optimization based on a previous run, to further improve results you’ve already obtained.
- It supports ensembles/stacks, where a set of models is trained to make a first prediction, and a final estimator then builds on the outputs of those predictors to make the final prediction.
- And obviously, it runs in AML (albeit without benefiting from clusters/parallelization).
Two other relevant links I’d also include are Optuna (https://optuna.org/), a library specifically for hyperparameter tuning (similar to HyperOpt), and LightAutoML (https://github.com/sberbank-ai-lab/LightAutoML), both widely used in Kaggle competitions.
Cheers!