This page lists the learning methods already integrated in mlr.

Columns Num., Fac., Ord., NAs, and Weights indicate whether a method can handle numerical, factor, and ordered factor predictors, whether it can deal with missing values in a meaningful way (other than simply removing observations with missing values), and whether observation weights are supported.

Column Props shows further properties of the learning methods specific to the type of learning task. See also RLearner() for details.
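These lists and properties can also be queried from within mlr. A minimal sketch, assuming mlr is installed and loaded; the property filter and the column names used for subsetting below are only illustrative:

    library(mlr)

    # List all classification learners that can predict probabilities
    # and handle multi-class problems.
    lrns = listLearners("classif", properties = c("prob", "multiclass"))
    head(lrns[, c("class", "package")])

    # Inspect the properties of a single learner.
    getLearnerProperties(makeLearner("classif.rpart"))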

Classification (84)

For classification the following additional learner properties are relevant and shown in column Props (a short usage sketch follows the list):

  • prob: The method can predict probabilities,
  • oneclass, twoclass, multiclass: One-class, two-class (binary) or multi-class classification problems can be handled,
  • class.weights: Class weights can be handled.
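A minimal sketch of how the prob property is used, assuming the bundled example task sonar.task; the learner choice is arbitrary:

    # Request probability predictions from a learner with the "prob" property.
    lrn = makeLearner("classif.randomForest", predict.type = "prob")
    mod = train(lrn, sonar.task)
    pred = predict(mod, sonar.task)
    head(getPredictionProbabilities(pred))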
Class / Short Name / Name Packages Num. Fac. Ord. NAs Weights Props Note
classif.ada
ada

ada Boosting
ada
rpart
X X prob
twoclass
xval has been set to 0 by default for speed.
classif.adaboostm1
adaboostm1

ada Boosting M1
RWeka X X prob
twoclass
multiclass
NAs are directly passed to WEKA with na.action = na.pass.
classif.bartMachine
bartmachine

Bayesian Additive Regression Trees
bartMachine X X X prob
twoclass
use_missing_data has been set to TRUE by default to allow missing data support.
classif.binomial
binomial

Binomial Regression
stats X X X prob
twoclass
Delegates to glm with a freely choosable binomial link function via the learner parameter link. We set ‘model’ to FALSE by default to save memory.
classif.boosting
adabag

Adabag Boosting
adabag
rpart
X X X prob
twoclass
multiclass
featimp
xval has been set to 0 by default for speed.
classif.bst
bst

Gradient Boosting
bst
rpart
X twoclass Renamed parameter learner to Learner due to a name clash with setHyperPars. Default changes: Learner = "ls", xval = 0, and maxdepth = 1.
classif.C50
C50

C50
C50 X X X X prob
twoclass
multiclass
classif.cforest
cforest

Random forest based on conditional inference trees
party X X X X X prob
twoclass
multiclass
featimp
See ?ctree_control for possible breakage for nominal features with missingness.
classif.clusterSVM
clusterSVM

Clustered Support Vector Machines
SwarmSVM
LiblineaR
X twoclass centers set to 2 by default.
classif.ctree
ctree

Conditional Inference Trees
party X X X X X prob
twoclass
multiclass
See ?ctree_control for possible breakage for nominal features with missingness.
classif.cvglmnet
cvglmnet

GLM with Lasso or Elasticnet Regularization (Cross Validated Lambda)
glmnet X X X prob
twoclass
multiclass
The family parameter is set to binomial for two-class problems and to multinomial otherwise. Factors automatically get converted to dummy columns, ordered factors to integer. glmnet uses a global control object for its parameters. mlr resets all control parameters to their defaults before setting the specified parameters and after training. If you are setting glmnet.control parameters through glmnet.control, you need to save and re-set them after running the glmnet learner.
classif.dbnDNN
dbn.dnn

Deep neural network with weights initialized by DBN
deepnet X prob
twoclass
multiclass
output set to "softmax" by default.
classif.dcSVM
dcSVM

Divided-Conquer Support Vector Machines
SwarmSVM
e1071
X twoclass
classif.earth
fda

Flexible Discriminant Analysis
earth
stats
X X X prob
twoclass
multiclass
This learner performs flexible discriminant analysis using the earth algorithm. na.action is set to na.fail and only this is supported.
classif.evtree
evtree

Evolutionary learning of globally optimal trees
evtree X X X X prob
twoclass
multiclass
pmutatemajor, pmutateminor, pcrossover, psplit, and pprune are scaled internally to sum to 100.
classif.extraTrees
extraTrees

Extremely Randomized Trees
extraTrees X X prob
twoclass
multiclass
classif.fdausc.glm
fdausc.glm

Generalized Linear Models classification on FDA
fda.usc prob
twoclass
multiclass
functionals
model$C[[1]] is set to quote(classif.glm)
classif.fdausc.kernel
fdausc.kernel

Kernel classification on FDA
fda.usc prob
twoclass
multiclass
single.functional
Argument draw=FALSE is used as default.
classif.fdausc.knn
fdausc.knn

fdausc.knn
fda.usc X prob
twoclass
multiclass
single.functional
Argument draw=FALSE is used as default.
classif.fdausc.np
fdausc.np

Nonparametric classification on FDA
fda.usc prob
twoclass
multiclass
single.functional
Argument draw=FALSE is used as default. Additionally, mod$C[[1]] is set to quote(classif.np)
classif.FDboost
FDboost

Functional linear array classification boosting
FDboost
mboost
X prob
twoclass
functionals
Uses only one base learner per functional or scalar covariate, with the same hyperparameters for every base learner. Interactions between scalar covariates are currently not supported. The default for family has been set to ‘Binomial’, as ‘Gaussian’ is not applicable.
classif.featureless
featureless

Featureless classifier
mlr X X X X prob
twoclass
multiclass
functionals
classif.fgam
FGAM

functional general additive model
refund prob
twoclass
functionals
single.functional
classif.fnn
fnn

Fast k-Nearest Neighbour
FNN X twoclass
multiclass
classif.gamboost
gamboost

Gradient boosting with smooth components
mboost X X X prob
twoclass
family has been set to Binomial() by default. For the families ‘AUC’ and ‘AdaExp’, probabilities cannot be predicted.
classif.gaterSVM
gaterSVM

Mixture of SVMs with Neural Network Gater Function
SwarmSVM X twoclass m set to 3 and max.iter set to 1 by default.
classif.gausspr
gausspr

Gaussian Processes
kernlab X X prob
twoclass
multiclass
Kernel parameters have to be passed directly and not by using the kpar list in gausspr. Note that fit has been set to FALSE by default for speed.
classif.gbm
gbm

Gradient Boosting Machine
gbm X X X X prob
twoclass
multiclass
featimp
keep.data is set to FALSE to reduce memory requirements. Param ‘n.cores’ has been set to ‘1’ by default to suppress parallelization by the package.
classif.geoDA
geoda

Geometric Predictive Discriminant Analysis
DiscriMiner X twoclass
multiclass
classif.glmboost
glmboost

Boosting for GLMs
mboost X X X prob
twoclass
family has been set to Binomial by default. For the families ‘AUC’ and ‘AdaExp’, probabilities cannot be predicted.
classif.glmnet
glmnet

GLM with Lasso or Elasticnet Regularization
glmnet X X X prob
twoclass
multiclass
The family parameter is set to binomial for two-class problems and to multinomial otherwise. Factors automatically get converted to dummy columns, ordered factors to integer. Parameter s (value of the regularization parameter used for predictions) is set to 0.01 by default, but needs to be tuned by the user. glmnet uses a global control object for its parameters. mlr resets all control parameters to their defaults before setting the specified parameters and after training. If you are setting glmnet.control parameters through glmnet.control, you need to save and re-set them after running the glmnet learner.
classif.h2o.deeplearning
h2o.dl

h2o.deeplearning
h2o X X X X prob
twoclass
multiclass
featimp
The default value of missing_values_handling is "MeanImputation", so missing values are automatically mean-imputed.
classif.h2o.gbm
h2o.gbm

h2o.gbm
h2o X X X prob
twoclass
multiclass
featimp
‘distribution’ is set automatically to ‘gaussian’.
classif.h2o.glm
h2o.glm

h2o.glm
h2o X X X X prob
twoclass
featimp
family is always set to "binomial" to get a binary classifier. The default value of missing_values_handling is "MeanImputation", so missing values are automatically mean-imputed.
classif.h2o.randomForest
h2o.rf

h2o.randomForest
h2o X X X prob
twoclass
multiclass
featimp
classif.IBk
ibk

k-Nearest Neighbours
RWeka X X prob
twoclass
multiclass
classif.J48
j48

J48 Decision Trees
RWeka X X X prob
twoclass
multiclass
NAs are directly passed to WEKA with na.action = na.pass.
classif.JRip
jrip

Propositional Rule Learner
RWeka X X X prob
twoclass
multiclass
NAs are directly passed to WEKA with na.action = na.pass.
classif.kknn
kknn

k-Nearest Neighbor
kknn X X prob
twoclass
multiclass
classif.knn
knn

k-Nearest Neighbor
class X twoclass
multiclass
classif.ksvm
ksvm

Support Vector Machines
kernlab X X prob
twoclass
multiclass
class.weights
Kernel parameters have to be passed directly and not by using the kpar list in ksvm. Note that fit has been set to FALSE by default for speed.
classif.lda
lda

Linear Discriminant Analysis
MASS X X prob
twoclass
multiclass
Learner parameter predict.method maps to method in predict.lda.
classif.LiblineaRL1L2SVC
liblinl1l2svc

L1-Regularized L2-Loss Support Vector Classification
LiblineaR X twoclass
multiclass
class.weights
classif.LiblineaRL1LogReg
liblinl1logreg

L1-Regularized Logistic Regression
LiblineaR X prob
twoclass
multiclass
class.weights
classif.LiblineaRL2L1SVC
liblinl2l1svc

L2-Regularized L1-Loss Support Vector Classification
LiblineaR X twoclass
multiclass
class.weights
classif.LiblineaRL2LogReg
liblinl2logreg

L2-Regularized Logistic Regression
LiblineaR X prob
twoclass
multiclass
class.weights
type = 0 (the default) is primal and type = 7 is dual problem.
classif.LiblineaRL2SVC
liblinl2svc

L2-Regularized L2-Loss Support Vector Classification
LiblineaR X twoclass
multiclass
class.weights
type = 2 (the default) is primal and type = 1 is dual problem.
classif.LiblineaRMultiClassSVC
liblinmulticlasssvc

Support Vector Classification by Crammer and Singer
LiblineaR X twoclass
multiclass
class.weights
classif.linDA
linda

Linear Discriminant Analysis
DiscriMiner X twoclass
multiclass
Set validation = NULL by default to disable internal test set validation.
classif.logreg
logreg

Logistic Regression
stats X X X prob
twoclass
Delegates to glm with family = binomial(link = 'logit'). We set ‘model’ to FALSE by default to save memory.
classif.lssvm
lssvm

Least Squares Support Vector Machine
kernlab X X twoclass
multiclass
fitted has been set to FALSE by default for speed.
classif.lvq1
lvq1

Learning Vector Quantization
class X twoclass
multiclass
classif.mda
mda

Mixture Discriminant Analysis
mda X X prob
twoclass
multiclass
keep.fitted has been set to FALSE by default for speed, and we use start.method = "lvq" for more robust behavior and fewer technical crashes.
classif.mlp
mlp

Multi-Layer Perceptron
RSNNS X prob
twoclass
multiclass
classif.multinom
multinom

Multinomial Regression
nnet X X X prob
twoclass
multiclass
classif.naiveBayes
nbayes

Naive Bayes
e1071 X X X prob
twoclass
multiclass
classif.neuralnet
neuralnet

Neural Network from neuralnet
neuralnet X prob
twoclass
err.fct has been set to ce and linear.output to FALSE to do classification.
classif.nnet
nnet

Neural Network
nnet X X X prob
twoclass
multiclass
linout=TRUE is hardcoded for regression. size has been set to 3 by default.
classif.nnTrain
nn.train

Training Neural Network by Backpropagation
deepnet X prob
twoclass
multiclass
output set to softmax by default. max.number.of.layers can be set to control and tune the maximal number of layers specified via hidden.
classif.nodeHarvest
nodeHarvest

Node Harvest
nodeHarvest X X prob
twoclass
classif.OneR
oner

1-R Classifier
RWeka X X X prob
twoclass
multiclass
NAs are directly passed to WEKA with na.action = na.pass.
classif.pamr
pamr

Nearest shrunken centroid
pamr X prob
twoclass
Threshold for prediction (threshold.predict) has been set to 1 by default.
classif.PART
part

PART Decision Lists
RWeka X X X prob
twoclass
multiclass
NAs are directly passed to WEKA with na.action = na.pass.
classif.penalized
penalized

Penalized Logistic Regression
penalized X X X prob
twoclass
trace=FALSE was set by default to disable logging output.
classif.plr
plr

Logistic Regression with a L2 Penalty
stepPlr X X X prob
twoclass
AIC and BIC penalty types can be selected via the new parameter cp.type.
classif.plsdaCaret
plsdacaret

Partial Least Squares (PLS) Discriminant Analysis
caret
pls
X prob
twoclass
multiclass
classif.probit
probit

Probit Regression
stats X X X prob
twoclass
Delegates to glm with family = binomial(link = 'probit'). We set ‘model’ to FALSE by default to save memory.
classif.qda
qda

Quadratic Discriminant Analysis
MASS X X prob
twoclass
multiclass
Learner parameter predict.method maps to method in predict.qda.
classif.quaDA
quada

Quadratic Discriminant Analysis
DiscriMiner X twoclass
multiclass
classif.randomForest
rf

Random Forest
randomForest X X X prob
twoclass
multiclass
class.weights
featimp
oobpreds
Note that the random forest can freeze the R process if it is trained on a task with a single constant feature. This can happen during forward feature selection, for example as a result of resampling, and such features need to be removed with removeConstantFeatures.
classif.randomForestSRC
rfsrc

Random Forest
randomForestSRC X X X X X prob
twoclass
multiclass
featimp
oobpreds
na.action has been set to "na.impute" by default to allow missing data support.
classif.ranger
ranger

Random Forests
ranger X X X X prob
twoclass
multiclass
featimp
oobpreds
By default, internal parallelization is switched off (num.threads = 1), verbose output is disabled, and respect.unordered.factors is set to ‘order’ for all split rules. If predict.type = ‘prob’, we set ‘probability = TRUE’ in ranger.
classif.rda
rda

Regularized Discriminant Analysis
klaR X X prob
twoclass
multiclass
estimate.error has been set to FALSE by default for speed.
classif.rFerns
rFerns

Random ferns
rFerns X X X twoclass
multiclass
oobpreds
classif.rknn
rknn

Random k-Nearest-Neighbors
rknn X X twoclass
multiclass
k is restricted to values < 99 because the code allocates arrays of static size.
classif.rotationForest
rotationForest

Rotation Forest
rotationForest X X X prob
twoclass
classif.rpart
rpart

Decision Tree
rpart X X X X X prob
twoclass
multiclass
featimp
xval has been set to 0 by default for speed.
classif.RRF
RRF

Regularized Random Forests
RRF X X prob
twoclass
multiclass
featimp
classif.rrlda
rrlda

Robust Regularized Linear Discriminant Analysis
rrlda X twoclass
multiclass
classif.saeDNN
sae.dnn

Deep neural network with weights initialized by Stacked AutoEncoder
deepnet X prob
twoclass
multiclass
output set to "softmax" by default.
classif.sda
sda

Shrinkage Discriminant Analysis
sda X prob
twoclass
multiclass
classif.sparseLDA
sparseLDA

Sparse Discriminant Analysis
sparseLDA
MASS
elasticnet
X prob
twoclass
multiclass
Arguments Q and stop are not yet provided as they depend on the task.
classif.svm
svm

Support Vector Machines (libsvm)
e1071 X X prob
twoclass
multiclass
class.weights
classif.xgboost
xgboost

eXtreme Gradient Boosting
xgboost X X X prob
twoclass
multiclass
featimp
All settings are passed directly, rather than through xgboost’s params argument. nrounds has been set to 1 and verbose to 0 by default. num_class is set internally, so do not set this manually.

Regression (59)

Additional learner properties:

  • se: Standard errors can be predicted (see the sketch below).
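A minimal sketch of standard-error prediction, assuming the bundled Boston Housing task bh.task; any learner with the se property works the same way:

    # Predict the mean response together with its standard error.
    lrn = makeLearner("regr.lm", predict.type = "se")
    mod = train(lrn, bh.task)
    pred = predict(mod, bh.task)
    head(pred$data)  # contains the columns "response" and "se"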
Class / Short Name / Name Packages Num. Fac. Ord. NAs Weights Props Note
regr.bartMachine
bartmachine

Bayesian Additive Regression Trees
bartMachine X X X use_missing_data has been set to TRUE by default to allow missing data support.
regr.bcart
bcart

Bayesian CART
tgp X X se
regr.bgp
bgp

Bayesian Gaussian Process
tgp X se
regr.bgpllm
bgpllm

Bayesian Gaussian Process with jumps to the Limiting Linear Model
tgp X se
regr.blm
blm

Bayesian Linear Model
tgp X se
regr.brnn
brnn

Bayesian regularization for feed-forward neural networks
brnn X X
regr.bst
bst

Gradient Boosting
bst
rpart
X Renamed parameter learner to Learner due to a name clash with setHyperPars. Default changes: Learner = "ls", xval = 0, and maxdepth = 1.
regr.btgp
btgp

Bayesian Treed Gaussian Process
tgp X X se
regr.btgpllm
btgpllm

Bayesian Treed Gaussian Process with jumps to the Limiting Linear Model
tgp X X se
regr.btlm
btlm

Bayesian Treed Linear Model
tgp X X se
regr.cforest
cforest

Random Forest Based on Conditional Inference Trees
party X X X X X featimp See ?ctree_control for possible breakage for nominal features with missingness.
regr.crs
crs

Regression Splines
crs X X X se
regr.ctree
ctree

Conditional Inference Trees
party X X X X X See ?ctree_control for possible breakage for nominal features with missingness.
regr.cubist
cubist

Cubist
Cubist X X X
regr.cvglmnet
cvglmnet

GLM with Lasso or Elasticnet Regularization (Cross Validated Lambda)
glmnet X X X Factors automatically get converted to dummy columns, ordered factors to integer. glmnet uses a global control object for its parameters. mlr resets all control parameters to their defaults before setting the specified parameters and after training. If you are setting glmnet.control parameters through glmnet.control, you need to save and re-set them after running the glmnet learner.
regr.earth
earth

Multivariate Adaptive Regression Splines
earth X X
regr.evtree
evtree

Evolutionary learning of globally optimal trees
evtree X X X X pmutatemajor, pmutateminor, pcrossover, psplit, and pprune are scaled internally to sum to 100.
regr.extraTrees
extraTrees

Extremely Randomized Trees
extraTrees X X
regr.FDboost
FDboost

Functional linear array regression boosting
FDboost
mboost
X functionals Only one base learner is allowed per functional covariate and one per scalar covariate; the parameters of these base learners are the same. Interactions between scalar covariates are currently not supported.
regr.featureless
featureless

Featureless regression
mlr X X X X functionals
regr.fgam
FGAM

functional general additive model
refund functionals
single.functional
regr.fnn
fnn

Fast k-Nearest Neighbor
FNN X
regr.frbs
frbs

Fuzzy Rule-based Systems
frbs X
regr.gamboost
gamboost

Gradient Boosting with Smooth Components
mboost X X X
regr.gausspr
gausspr

Gaussian Processes
kernlab X X se Kernel parameters have to be passed directly and not by using the kpar list in gausspr. Note that fit has been set to FALSE by default for speed.
regr.gbm
gbm

Gradient Boosting Machine
gbm X X X X featimp keep.data is set to FALSE to reduce memory requirements, and distribution has been set to "gaussian" by default. Param ‘n.cores’ has been set to ‘1’ by default to suppress parallelization by the package.
regr.glm
glm

Generalized Linear Regression
stats X X X se ‘family’ must be a character string, and every family has its own link, e.g. family = ‘gaussian’, link.gaussian = ‘identity’, which is also the default. We set ‘model’ to FALSE by default to save memory.
regr.glmboost
glmboost

Boosting for GLMs
mboost X X X
regr.glmnet
glmnet

GLM with Lasso or Elasticnet Regularization
glmnet X X X X Factors automatically get converted to dummy columns, ordered factors to integer. Parameter s (value of the regularization parameter used for predictions) is set to 0.01 by default, but needs to be tuned by the user. glmnet uses a global control object for its parameters. mlr resets all control parameters to their defaults before setting the specified parameters and after training. If you are setting glmnet.control parameters through glmnet.control, you need to save and re-set them after running the glmnet learner.
regr.GPfit
GPfit

Gaussian Process
GPfit X se (1) As the optimization routine assumes that the inputs are scaled to the unit hypercube [0,1]^d, the input gets scaled for each variable by default. If this is not wanted, scale = FALSE has to be set. (2) We replace the GPfit parameter corr = list(type = ‘exponential’, power = 1.95) with the separate parameters ‘type’ and ‘power’; in the case of corr = list(type = ‘matern’, nu = 0.5), the separate parameters are ‘type’ and ‘matern_nu_k = 0’, and nu is computed as nu = (2 * matern_nu_k + 1) / 2 = 0.5.
regr.h2o.deeplearning
h2o.dl

h2o.deeplearning
h2o X X X X The default value of missing_values_handling is "MeanImputation", so missing values are automatically mean-imputed.
regr.h2o.gbm
h2o.gbm

h2o.gbm
h2o X X X
regr.h2o.glm
h2o.glm

h2o.glm
h2o X X X X family is always set to "gaussian". The default value of missing_values_handling is "MeanImputation", so missing values are automatically mean-imputed.
regr.h2o.randomForest
h2o.rf

h2o.randomForest
h2o X X X
regr.IBk
ibk

K-Nearest Neighbours
RWeka X X
regr.kknn
kknn

K-Nearest-Neighbor regression
kknn X X
regr.km
km

Kriging
DiceKriging X se In predict, we currently always use type = "SK". The extra parameter jitter (default FALSE) adds a very small jitter (order 1e-12) to the x-values before prediction, because predict.km reproduces the exact y-values of the training data points when they are passed in, even if the nugget effect is turned on. We further introduced nugget.stability, which sets the nugget to nugget.stability * var(y) before each training run to improve numerical stability; we recommend a setting of 10^-8.
regr.ksvm
ksvm

Support Vector Machines
kernlab X X Kernel parameters have to be passed directly and not by using the kpar list in ksvm. Note that fit has been set to FALSE by default for speed.
regr.laGP
laGP

Local Approximate Gaussian Process
laGP X se
regr.LiblineaRL2L1SVR
liblinl2l1svr

L2-Regularized L1-Loss Support Vector Regression
LiblineaR X Parameter svr_eps has been set to 0.1 by default.
regr.LiblineaRL2L2SVR
liblinl2l2svr

L2-Regularized L2-Loss Support Vector Regression
LiblineaR X type = 11 (the default) is primal and type = 12 is dual problem. Parameter svr_eps has been set to 0.1 by default.
regr.lm
lm

Simple Linear Regression
stats X X X se
regr.mars
mars

Multivariate Adaptive Regression Splines
mda X
regr.mob
mob

Model-based Recursive Partitioning Yielding a Tree with Fitted Models Associated with each Terminal Node
party
modeltools
X X X
regr.nnet
nnet

Neural Network
nnet X X X size has been set to 3 by default.
regr.nodeHarvest
nodeHarvest

Node Harvest
nodeHarvest X X
regr.pcr
pcr

Principal Component Regression
pls X X
regr.penalized
penalized

Penalized Regression
penalized X X trace=FALSE was set by default to disable logging output.
regr.plsr
plsr

Partial Least Squares Regression
pls X X
regr.randomForest
rf

Random Forest
randomForest X X X featimp
oobpreds
se
See the section about ‘regr.randomForest’ in ?makeLearner for information about se estimation. Note that the random forest can freeze the R process if it is trained on a task with a single constant feature. This can happen during forward feature selection, for example as a result of resampling, and such features need to be removed with removeConstantFeatures. keep.inbag is NULL by default, but if predict.type = ‘se’ and se.method = ‘jackknife’ (the default), it is automatically set to TRUE.
regr.randomForestSRC
rfsrc

Random Forest
randomForestSRC X X X X X featimp
oobpreds
na.action has been set to "na.impute" by default to allow missing data support.
regr.ranger
ranger

Random Forests
ranger X X X X featimp
oobpreds
se
By default, internal parallelization is switched off (num.threads = 1), verbose output is disabled, and respect.unordered.factors is set to ‘order’ for all split rules. All settings are changeable. mtry.perc sets mtry to mtry.perc * getTaskNFeats(.task). The default for mtry is the floor of the square root of the number of features in the task. SE estimation uses the Monte Carlo bias-corrected jackknife after bootstrap; see the section about ‘regr.randomForest’ in ?makeLearner for more details.
regr.rknn
rknn

Random k-Nearest-Neighbors
rknn X X
regr.rpart
rpart

Decision Tree
rpart X X X X X featimp xval has been set to 0 by default for speed.
regr.RRF
RRF

Regularized Random Forests
RRF X X X featimp
regr.rsm
rsm

Response Surface Regression
rsm X You select the order of the regression by using modelfun = "FO" (first order), "TWI" (two-way interactions, this is with 1st order terms!) and "SO" (full second order).
regr.rvm
rvm

Relevance Vector Machine
kernlab X X Kernel parameters have to be passed directly and not by using the kpar list in rvm. Note that fit has been set to FALSE by default for speed.
regr.svm
svm

Support Vector Machines (libsvm)
e1071 X X
regr.xgboost
xgboost

eXtreme Gradient Boosting
xgboost X X X featimp All settings are passed directly, rather than through xgboost’s params argument. nrounds has been set to 1 and verbose to 0 by default.

Survival analysis (10)

Additional learner properties:

  • prob: Probabilities can be predicted,
  • rcens, lcens, icens: The learner can handle right-, left- and/or interval-censored data (see the sketch below).
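A minimal sketch of fitting a survival learner, assuming the bundled right-censored example task lung.task; the learner choice is arbitrary:

    # Train a Cox proportional hazards model on a survival task.
    lrn = makeLearner("surv.coxph")
    mod = train(lrn, lung.task)
    pred = predict(mod, lung.task)
    head(pred$data)  # the response column holds the predicted risk (linear predictor)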
Class / Short Name / Name Packages Num. Fac. Ord. NAs Weights Props Note
surv.cforest
crf

Random Forest based on Conditional Inference Trees
party
survival
X X X X X featimp See ?ctree_control for possible breakage for nominal features with missingness.
surv.coxph
coxph

Cox Proportional Hazard Model
survival X X X
surv.cvglmnet
cvglmnet

GLM with Regularization (Cross Validated Lambda)
glmnet X X X X Factors automatically get converted to dummy columns, ordered factors to integer.
surv.gamboost
gamboost

Gradient boosting with smooth components
survival
mboost
X X X X family has been set to CoxPH() by default.
surv.gbm
gbm

Gradient Boosting Machine
gbm X X X X featimp keep.data is set to FALSE to reduce memory requirements.
surv.glmboost
glmboost

Gradient Boosting with Componentwise Linear Models
survival
mboost
X X X X family has been set to CoxPH() by default.
surv.glmnet
glmnet

GLM with Regularization
glmnet X X X X Factors automatically get converted to dummy columns, ordered factors to integer. Parameter s (value of the regularization parameter used for predictions) is set to 0.1 by default, but needs to be tuned by the user. glmnet uses a global control object for its parameters. mlr resets all control parameters to their defaults before setting the specified parameters and after training. If you are setting glmnet.control parameters through glmnet.control, you need to save and re-set them after running the glmnet learner.
surv.randomForestSRC
rfsrc

Random Forest
survival
randomForestSRC
X X X X X featimp
oobpreds
na.action has been set to "na.impute" by default to allow missing data support.
surv.ranger
ranger

Random Forests
ranger X X X X featimp By default, internal parallelization is switched off (num.threads = 1), verbose output is disabled, and respect.unordered.factors is set to ‘order’ for all split rules. All settings are changeable.
surv.rpart
rpart

Survival Tree
rpart X X X X X featimp xval has been set to 0 by default for speed.

Cluster analysis (10)

Additional learner properties:

  • prob: Probabilities can be predicted (see the sketch below).
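A minimal sketch of training a clusterer, assuming the bundled example task mtcars.task; the number of centers is an arbitrary illustrative value:

    # Fit k-means and assign observations to clusters.
    lrn = makeLearner("cluster.kmeans", centers = 3)
    mod = train(lrn, mtcars.task)
    pred = predict(mod, mtcars.task)
    head(pred$data)  # integer cluster memberships in the response column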
Class / Short Name / Name Packages Num. Fac. Ord. NAs Weights Props Note
cluster.cmeans
cmeans

Fuzzy C-Means Clustering
e1071
clue
X prob The predict method uses cl_predict from the clue package to compute the cluster memberships for new data. The default centers = 2 is added so the method runs without setting parameters, but in practice this should of course be adjusted by the user.
cluster.Cobweb
cobweb

Cobweb Clustering Algorithm
RWeka X
cluster.dbscan
dbscan

DBScan Clustering
fpc X A cluster index of NA indicates noise points. Specify method = 'dist' if the data should be interpreted as a dissimilarity matrix or object. Otherwise, Euclidean distances will be used.
cluster.EM
em

Expectation-Maximization Clustering
RWeka X
cluster.FarthestFirst
farthestfirst

FarthestFirst Clustering Algorithm
RWeka X
cluster.kkmeans
kkmeans

Kernel K-Means
kernlab X centers has been set to 2L by default. The nearest center in kernel distance determines the cluster assignment of new data points. Kernel parameters have to be passed directly and not by using the kpar list in kkmeans.
cluster.kmeans
kmeans

K-Means
stats
clue
X prob The predict method uses cl_predict from the clue package to compute the cluster memberships for new data. The default centers = 2 is added so the method runs without setting parameters, but in practice this should of course be adjusted by the user.
cluster.MiniBatchKmeans
MBatchKmeans

MiniBatchKmeans
ClusterR X prob Calls MiniBatchKmeans of package ClusterR. Argument clusters has a default value of 2 if not provided by the user.
cluster.SimpleKMeans
simplekmeans

K-Means Clustering
RWeka X
cluster.XMeans
xmeans

XMeans (k-means with automatic determination of k)
RWeka X You may have to install the XMeans Weka package: WPM('install-package', 'XMeans').

Cost-sensitive classification

For ordinary misclassification costs you can use all the standard classification methods listed above.
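For example, a standard classifier can be evaluated under an asymmetric cost matrix with makeCostMeasure(). A minimal sketch, assuming the bundled sonar.task (classes M and R); the cost values are purely illustrative:

    # Rows are true classes, columns are predicted classes:
    # predicting R for a true M costs 5, the reverse costs 1.
    costs = matrix(c(0, 1, 5, 0), nrow = 2,
      dimnames = list(truth = c("M", "R"), predicted = c("M", "R")))
    cost.meas = makeCostMeasure(costs = costs, best = 0, worst = 5)

    lrn = makeLearner("classif.rpart")
    r = resample(lrn, sonar.task, cv3, measures = list(cost.meas, mmce))
    r$aggr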

For example-dependent costs there are several ways to generate cost-sensitive learners from ordinary regression and classification learners. See section cost-sensitive classification and the documentation of makeCostSensClassifWrapper(), makeCostSensRegrWrapper() and makeCostSensWeightedPairsWrapper() for details.
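A minimal sketch of the wrapper approach for example-dependent costs, using the iris data; the random cost matrix is purely illustrative:

    # One row of costs per observation, one column per class label.
    set.seed(1)
    cost = matrix(runif(nrow(iris) * 3, min = 0, max = 10), nrow = nrow(iris))
    colnames(cost) = levels(iris$Species)
    costsens.task = makeCostSensTask(data = iris[, -5], cost = cost)

    # Wrap an ordinary classification learner for the cost-sensitive task.
    lrn = makeCostSensClassifWrapper(makeLearner("classif.multinom", trace = FALSE))
    mod = train(lrn, costsens.task)
    predict(mod, costsens.task)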

Multilabel classification (3)

Class / Short Name / Name Packages Num. Fac. Ord. NAs Weights Props Note
multilabel.cforest
cforest

Random forest based on conditional inference trees
party X X X X X prob
multilabel.randomForestSRC
rfsrc

Random Forest
randomForestSRC X X X X prob na.action has been set to na.impute by default to allow missing data support.
multilabel.rFerns
rFerns

Random ferns
rFerns X X X

Moreover, you can use the binary relevance method to apply ordinary classification learners to the multilabel problem. See the documentation of function makeMultilabelBinaryRelevanceWrapper() and the tutorial section on multilabel classification for details.
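A minimal sketch of the binary relevance approach, assuming the bundled multilabel example task yeast.task; the base learner is arbitrary:

    # Fit one binary classifier per label.
    lrn = makeLearner("classif.rpart", predict.type = "prob")
    lrn = makeMultilabelBinaryRelevanceWrapper(lrn)
    mod = train(lrn, yeast.task)
    pred = predict(mod, yeast.task)
    performance(pred, measures = multilabel.hamloss)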