This blog post is a recap of the mlr-org workshop 2021 which took place from the 28th of September until the 10th of October.
First of all, we would like to thank all the people and organizations that made this workshop possible:
If you like our work and want to see more workshops like this, during which we are able to devote our full time to mlr3, consider becoming a GitHub sponsor ❤️
For our core package {mlr3}, we mainly focused on maintenance and issue curation. A little new goodie relates to scoring and aggregating: one can now conduct more complex benchmarks using different evaluation metrics.
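For example, aggregating a benchmark result with several measures at once could look roughly like the following sketch (the task, learners and measure keys are arbitrary examples, not part of the workshop changes):

```r
library(mlr3)

# benchmark two learners on one task with 3-fold cross-validation
design = benchmark_grid(
  tasks = tsk("sonar"),
  learners = lrns(c("classif.rpart", "classif.featureless"), predict_type = "prob"),
  resamplings = rsmp("cv", folds = 3)
)
bmr = benchmark(design)

# aggregate the benchmark result with several evaluation metrics at once
bmr$aggregate(msrs(c("classif.acc", "classif.auc")))
```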
{mlr3pipelines} has also seen a lot of maintenance in the first place. In addition, we did some code profiling and were able to improve performance a bit by reducing the overhead of cloning ParamSets and learners. This comes with the new `%>>!%` operator, which concatenates partial Graphs in place and thereby carries over the memory and speed improvements described above.
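A minimal sketch of how the in-place concatenation might be used (the chosen PipeOps are arbitrary examples):

```r
library(mlr3pipelines)

# build a small preprocessing graph
graph = po("imputemean") %>>% po("scale")

# append another PipeOp in place: `graph` itself is modified, which avoids
# the cloning overhead incurred by the regular `%>>%` operator
graph %>>!% po("pca")

graph$ids()
```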
A new sugar function `pos()` now exists, making it possible to initialize multiple pipe operators more easily.
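As a rough illustration (the PipeOp keys below are arbitrary):

```r
library(mlr3pipelines)

# pos() constructs several PipeOps at once and returns them as a list,
# analogous to how po() constructs a single PipeOp
ops = pos(c("imputemean", "scale", "pca"))

# the resulting PipeOps can then be chained into a Graph
graph = Reduce(`%>>%`, ops)
```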
We also started working on adding more sugar to {paradox} (e.g. trafos), which should make life easier in {mlr3pipelines} in the long run.
Last, we overhauled some error messages internally to make them more descriptive for users.
mlr3tuning / bbotk / mlr3mbo / mlr3hyperband
One focus with respect to tuning was on “hot starting” learners: a learner can save its state at an arbitrary point and be retrained later, continuing from that point. As a result, tuning can be done in decoupled chunks instead of one big task. This is especially helpful for certain tuning methods like “hyperband”, which is why we have listed this feature in this section.
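A minimal sketch of what this can look like, using the debug learner and the `HotstartStack` class in {mlr3} (fidelity values chosen arbitrarily):

```r
library(mlr3)

task = tsk("pima")

# train a learner with a low fidelity (few iterations)
learner_small = lrn("classif.debug", iter = 1)
learner_small$train(task)

# store the fitted learner in a hotstart stack
stack = HotstartStack$new(list(learner_small))

# a new learner with more iterations can continue from the stored state
learner_large = lrn("classif.debug", iter = 2)
learner_large$hotstart_stack = stack
learner_large$train(task)  # resumes from the stored state instead of starting over
```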
Another improvement for {bbotk} and all tuning packages was the support for asynchronous evaluation of hyperparameter configurations. Before, one had to wait until all hyperband configurations were evaluated before a new point could be proposed. Now, new points can be evaluated at any time, making more efficient use of parallelization. This also avoids waiting on a few slow workers stuck with an inefficient configuration that takes a long time to finish.
We made great progress in finalizing {mlr3mbo}. {mlr3mbo} is a flexible Bayesian optimization toolkit and much more modular than its predecessor {mlrMBO}. Most of the functionality {mlrMBO} offers is already integrated, and we are looking forward to a CRAN release in the near future. In the meantime, make sure to check out the package on GitHub.
For our curated set of learners, we updated some ParamSets to the most recent upstream versions and ensured that our custom “Paramtest”, i.e. the heuristic with which we validate the ParamSets of {mlr3learners} against their upstream packages, works smoothly again.
{mlr3} now allows for bias auditing and debiasing of arbitrary learners through {mlr3fairness}. The package contains Fairness Metrics, Debiasing Strategies and Visualizations for arbitrary models trained through {mlr3}. We plan to release it on CRAN soon, but the package is already usable. If you want to learn more, check out the tutorials for the individual components:
We created a new ggplot2 theme `theme_mlr3()`, which is applied by default to all plots created with {mlr3viz}, i.e. all `autoplot()` methods in mlr3. This theme is heavily inspired by `ggpubr::theme_pubr()`. In addition to the theme, all {mlr3viz} plots now use the viridis color palette by default. Last, we have added some information on how users can easily apply theming changes to the plots returned by `autoplot()`.
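For example, restyling a default plot might work along these lines (task and replacement theme chosen arbitrarily):

```r
library(mlr3)
library(mlr3viz)
library(ggplot2)

task = tsk("pima")

# autoplot() returns a regular ggplot object, styled with theme_mlr3()
# and the viridis palette by default
p = autoplot(task)

# restyle it like any other ggplot, e.g. by adding a different theme
p + theme_minimal()
```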
{mlr3spatial} is a new package providing spatial data backends that can handle {sf}, {stars} and {terra} objects. It is capable of performing predictions on spatial raster objects in parallel using the internal mlr3 parallelization heuristic (which uses the {future} framework under the hood). Together with the {mlr3spatiotempcv} package, it extends the support for spatial machine learning in mlr3.
We have created a Roadmap of upcoming packages and planned features across the whole mlr3 ecosystem. The roadmap serves both our internal organization and gives external people an idea of what is coming up. It is quite new and not yet fully operational at the time this blog post is released; however, we encourage you to look at it from time to time, as we try to keep it up to date.
We have updated and reorganized our mlr3 wiki. It now has a better sidebar for easier navigation, and we have updated the content of the individual sections.
For attribution, please cite this work as
Schratz (2021, Oct. 7). mlr-org: mlr Workshop 2021 Recap. Retrieved from https://mlr-org.github.io/mlr-org-website/posts/2021-10-07-mlr-workshop-2021-recap/
BibTeX citation
@misc{schratz2021mlr,
  author = {Schratz, Patrick},
  title = {mlr-org: mlr Workshop 2021 Recap},
  url = {https://mlr-org.github.io/mlr-org-website/posts/2021-10-07-mlr-workshop-2021-recap/},
  year = {2021}
}