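Relative Feature Importance (RFI) generalizes Conditional Feature Importance (CFI) and Permutation Feature Importance (PFI) by allowing arbitrary conditioning sets and samplers.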
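The idea can be sketched in base R without the package: replace a feature with values drawn conditionally on the chosen conditioning set, then compare the loss on the perturbed data with the original loss. This is a minimal sketch, not the package's implementation. The helper names `sample_feature()` and `rfi_sketch()` are hypothetical, and the linear-model residual sampler is a simple stand-in for a real conditional sampler such as ConditionalARFSampler.

```r
set.seed(42)
n <- 2000
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)   # x2 is almost a copy of x1
y  <- x2 + rnorm(n, sd = 0.1)   # the target depends on x2 only
dat <- data.frame(x1, x2, y)

train <- dat[1:1000, ]
test  <- dat[1001:2000, ]
fit   <- lm(y ~ x1 + x2, data = train)

# Mean squared error of the fitted model on a data set
mse <- function(d) mean((d$y - predict(fit, newdata = d))^2)

# Hypothetical stand-in sampler (assumption, not the package's sampler):
# - empty conditioning set: plain permutation (marginal sampling, as in PFI)
# - otherwise: linear fit on the conditioning set plus shuffled residuals
sample_feature <- function(d, feature, conditioning_set) {
  if (length(conditioning_set) == 0) {
    return(sample(d[[feature]]))
  }
  m <- lm(reformulate(conditioning_set, response = feature), data = d)
  unname(fitted(m) + sample(residuals(m)))
}

# Importance as the difference between perturbed and original loss
rfi_sketch <- function(feature, conditioning_set = character(0)) {
  perturbed <- test
  perturbed[[feature]] <- sample_feature(test, feature, conditioning_set)
  mse(perturbed) - mse(test)
}

pfi_x2          <- rfi_sketch("x2")        # empty conditioning set
rfi_x2_given_x1 <- rfi_sketch("x2", "x1")  # conditioned on x1
```

In this toy setup the marginal score for `x2` should be much larger than the score conditioned on `x1`, because `x1` already carries nearly all the information in `x2`.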
References
König G, Molnar C, Bischl B, Grosse-Wentrup M (2021). “Relative Feature Importance.” In 2020 25th International Conference on Pattern Recognition (ICPR), 9318–9325. doi:10.1109/ICPR48806.2021.9413090.
Super classes
xplainfi::FeatureImportanceMethod -> xplainfi::PerturbationImportance -> RFI
Methods
Method new()
Creates a new instance of the RFI class
Usage
RFI$new(
task,
learner,
measure = NULL,
resampling = NULL,
features = NULL,
groups = NULL,
conditioning_set = NULL,
relation = "difference",
n_repeats = 1L,
batch_size = NULL,
sampler = NULL
)
Arguments
task, learner, measure, resampling, features, groups, relation, n_repeats, batch_size
Passed to PerturbationImportance.
conditioning_set
(character()) Set of features to condition on. Can be overridden in $compute(). The default (character(0)) is equivalent to PFI. In CFI, this would be set to all features except the feature of interest.
sampler
(ConditionalSampler) Optional custom sampler. Defaults to ConditionalARFSampler.
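The effect of conditioning_set can be illustrated by constructing two RFI objects that differ only in that argument. This is a sketch using only the arguments documented above. It assumes that xplainfi and the default sampler's dependencies are installed, and it uses `regr.rpart` as the learner only because it ships with mlr3.

```r
library(mlr3)
library(xplainfi)

task    <- tgen("friedman1")$generate(n = 200)
learner <- lrn("regr.rpart")
measure <- msr("regr.mse")

# Default conditioning set: documented above as equivalent to PFI
rfi_marginal <- RFI$new(task = task, learner = learner, measure = measure)

# Condition on two features: importance of each feature beyond what
# "important1" and "important2" already explain
rfi_subset <- RFI$new(
  task = task,
  learner = learner,
  measure = measure,
  conditioning_set = c("important1", "important2")
)

rfi_marginal$compute()
rfi_subset$compute()

rfi_marginal$importance()
rfi_subset$importance()
```

Because no resampling is passed, each object draws its own default holdout split, so the two tables are not computed on identical test sets.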
Method compute()
Compute RFI scores
Usage
RFI$compute(
conditioning_set = NULL,
n_repeats = NULL,
batch_size = NULL,
store_models = TRUE,
store_backends = TRUE
)
Arguments
conditioning_set
(character()) Set of features to condition on. If NULL, uses the stored parameter value.
n_repeats
(integer(1)) Number of permutation iterations. If NULL, uses the stored value.
batch_size
(integer(1) | NULL: NULL) Maximum number of rows to predict at once. If NULL, uses the stored value.
store_models, store_backends
(logical(1): TRUE) Whether to store fitted models / data backends, passed to mlr3::resample internally for the initial fit of the learner. This may be required for certain measures, so it is recommended to leave both enabled unless disabling them is really necessary.
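As a sketch of the override behavior described above, the values stored at construction can be replaced for a single call to $compute(). Only documented arguments are used. The learner choice and the specific values are illustrative assumptions.

```r
library(mlr3)
library(xplainfi)

task <- tgen("friedman1")$generate(n = 200)

# Stored settings: condition on "important1", one repeat
rfi <- RFI$new(
  task = task,
  learner = lrn("regr.rpart"),
  measure = msr("regr.mse"),
  conditioning_set = "important1",
  n_repeats = 1L
)

# Override the stored conditioning set and number of repeats for this call;
# arguments left as NULL fall back to the stored values
rfi$compute(
  conditioning_set = c("important1", "important2"),
  n_repeats = 5L
)

rfi$importance()
```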
Examples
library(mlr3)
task = tgen("friedman1")$generate(n = 200)
rfi = RFI$new(
task = task,
learner = lrn("regr.ranger", num.trees = 50),
measure = msr("regr.mse"),
conditioning_set = c("important1")
)
#> ℹ No <ConditionalSampler> provided, using <ConditionalARFSampler> with default settings.
#> ℹ No <Resampling> provided, using `resampling = rsmp("holdout", ratio = 2/3)`
#> (test set size: 67)
rfi$compute()
rfi$importance()
#> Key: <feature>
#>          feature  importance
#>           <char>       <num>
#>  1:   important1  0.00000000
#>  2:   important2  6.56214678
#>  3:   important3  1.85528306
#>  4:   important4  9.01753438
#>  5:   important5  2.36772225
#>  6: unimportant1  0.19616888
#>  7: unimportant2  0.09921649
#>  8: unimportant3  0.14696144
#>  9: unimportant4 -0.37958579
#> 10: unimportant5  0.34660008