(experimental) Conditional Inference Tree Conditional Sampler
Source: R/ConditionalCtreeSampler.R
ConditionalCtreeSampler.Rd
Implements conditional sampling using conditional inference trees (ctree). Builds a tree predicting target features from conditioning features, then samples from the terminal node corresponding to each test observation.
Details
This sampler approximates the conditional distribution \(P(X_B | X_A = x_A)\) by:
Building a conditional inference tree with \(X_B\) as response and \(X_A\) as predictors
For each test observation, finding its terminal (leaf) node in the tree
Sampling uniformly from training observations in that same terminal node
Conditional inference trees (ctree) use permutation tests to determine splits, which helps avoid overfitting and handles mixed feature types naturally. The tree partitions the feature space based on the conditioning variables, creating local neighborhoods that respect the conditional distribution structure.
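As a rough illustration of the three steps above (not the class's actual implementation, which lives in R/ConditionalCtreeSampler.R), here is a standalone sketch with a single hypothetical conditioning feature x_a and target feature x_b, using partykit::ctree() directly; whether the class calls partykit or party internally is not shown here.

library(partykit)

set.seed(1)
train <- data.frame(x_a = runif(200))
train$x_b <- sin(2 * pi * train$x_a) + rnorm(200, sd = 0.2)
test <- data.frame(x_a = c(0.1, 0.5, 0.9))

# Step 1: tree with the target feature as response and the conditioning feature as predictor
tree <- ctree(
  x_b ~ x_a,
  data = train,
  control = ctree_control(mincriterion = 0.95, minsplit = 20L, minbucket = 7L)
)

# Step 2: terminal node id of every training and test observation
node_train <- predict(tree, newdata = train, type = "node")
node_test <- predict(tree, newdata = test, type = "node")

# Step 3: for each test row, draw x_b uniformly from training rows in the same leaf
sampled_x_b <- vapply(node_test, function(nd) {
  idx <- which(node_train == nd)
  train$x_b[idx[sample.int(length(idx), 1L)]]
}, numeric(1))
sampled_x_b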
Key advantages over other samplers:
Handles mixed feature types (continuous and categorical)
Non-parametric (no distributional assumptions)
Automatic feature selection (splits only on informative features)
Can capture non-linear conditional relationships
Statistically principled splitting criteria
Hyperparameters control tree complexity (illustrated in the sketch after this list):
mincriterion: Significance level for splits (higher = fewer splits)
minsplit: Minimum observations required for a split
minbucket: Minimum observations in terminal nodes
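Assuming these are passed through to the identically named arguments of partykit::ctree_control() (plausible given the parameter names, but an assumption), their effect on tree size can be checked directly; fewer terminal nodes means a coarser partition and thus more training observations per leaf to sample from.

# Hypothetical comparison on the 'train' data from the sketch above
loose <- ctree(x_b ~ x_a, data = train, control = ctree_control(mincriterion = 0.80))
strict <- ctree(x_b ~ x_a, data = train, control = ctree_control(mincriterion = 0.99))
length(nodeids(loose, terminal = TRUE))   # number of terminal nodes
length(nodeids(strict, terminal = TRUE))  # usually no larger than for the looser tree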
This implementation is inspired by shapr's ctree approach but simplified for our use case (we build trees on-demand rather than pre-computing all subsets).
Advantages:
Works with any feature types
Robust to outliers
Interpretable tree structure
Handles high-dimensional conditioning
Limitations:
Requires model fitting (slower than kNN)
Can produce duplicates if terminal nodes are small
Tree building time increases with data size
References
Hothorn T, Hornik K, Zeileis A (2006). “Unbiased Recursive Partitioning: A Conditional Inference Framework.” Journal of Computational and Graphical Statistics, 15(3), 651–674. doi:10.1198/106186006X133933.
Aas K, Jullum M, Løland A (2021). “Explaining Individual Predictions When Features Are Dependent: More Accurate Approximations to Shapley Values.” Artificial Intelligence, 298, 103502. doi:10.1016/j.artint.2021.103502.
Super classes
xplainfi::FeatureSampler -> xplainfi::ConditionalSampler -> ConditionalCtreeSampler
Public fields
feature_types (character()) Feature types supported by the sampler. Will be checked against the provided mlr3::Task to ensure compatibility.
tree_cache (environment) Cache for fitted ctree models.
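To illustrate the caching idea (the key scheme below is hypothetical, not necessarily what the class uses), an R environment can memoize fitted trees per conditioning set so that repeated calls with the same conditioning features reuse the same model:

tree_cache <- new.env(parent = emptyenv())

# Hypothetical cache key: sorted conditioning feature names collapsed into a string
cache_key <- function(conditioning_set) paste(sort(conditioning_set), collapse = ",")

get_or_fit_tree <- function(conditioning_set, fit_fun) {
  key <- cache_key(conditioning_set)
  if (!exists(key, envir = tree_cache, inherits = FALSE)) {
    assign(key, fit_fun(), envir = tree_cache)
  }
  get(key, envir = tree_cache, inherits = FALSE)
}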
Methods
Method new()
Creates a new ConditionalCtreeSampler.
Usage
ConditionalCtreeSampler$new(
task,
conditioning_set = NULL,
mincriterion = 0.95,
minsplit = 20L,
minbucket = 7L,
use_cache = TRUE
)
Arguments
task (mlr3::Task) Task to sample from.
conditioning_set (character | NULL) Default conditioning set to use in $sample().
mincriterion (numeric(1): 0.95) Significance level threshold for splitting (1 - p-value). Higher values result in fewer splits (simpler trees).
minsplit (integer(1): 20L) Minimum number of observations required for a split.
minbucket (integer(1): 7L) Minimum number of observations in terminal nodes.
use_cache (logical(1): TRUE) Whether to cache fitted trees.
Examples
if (FALSE) { # \dontrun{
library(mlr3)
task = tgen("friedman1")$generate(n = 100)
# Create sampler with default parameters
sampler = ConditionalCtreeSampler$new(task)
# Sample features conditioned on others
test_data = task$data(rows = 1:5)
sampled = sampler$sample_newdata(
feature = c("important2", "important3"),
newdata = test_data,
conditioning_set = "important1"
)
} # }