vignettes/tutorial/devel/integrated_learners.Rmd
This page lists the learning methods already integrated in mlr.
Columns Num., Fac., Ord., NAs, and Weights indicate if a method can cope with numerical, factor, and ordered factor predictors, if it can deal with missing values in a meaningful way (other than simply removing observations with missing values) and if observation weights are supported.
Column Props shows further properties of the learning methods specific to the type of learning task. See also RLearner() for details.
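The property columns described above can also be queried from R. The following is a minimal sketch, assuming the mlr and rpart packages are installed; the property names passed to `listLearners()` and the choice of `classif.rpart` are only illustrative.

```r
library(mlr)

# List classification learners that can predict probabilities
# and handle missing values (columns Props and NAs in the tables).
lrns <- listLearners("classif", properties = c("prob", "missings"),
                     warn.missing.packages = FALSE)
head(lrns[, c("class", "package")])

# Inspect the properties of a single learner.
lrn <- makeLearner("classif.rpart")
getLearnerProperties(lrn)

# Check specific properties: one logical value per requested property.
has_props <- hasLearnerProperties(lrn, c("missings", "weights"))
print(has_props)
```

The returned data frame has one row per learner, so it can be filtered further with ordinary data frame operations.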
For classification the following additional learner properties are relevant and shown in column Props:
Class / Short Name / Name  Packages  Num.  Fac.  Ord.  NAs  Weights  Props  Note 

classif.ada ada ada Boosting 
ada rpart 
X  X  prob twoclass 
xval has been set to 0 by default for speed. 

classif.adaboostm1 adaboostm1 ada Boosting M1 
RWeka  X  X  prob twoclass multiclass 
NAs are directly passed to WEKA with na.action = na.pass . 

classif.bartMachine bartmachine Bayesian Additive Regression Trees 
bartMachine  X  X  X  prob twoclass 
use_missing_data has been set to TRUE by default to allow missing data support. 

classif.binomial binomial Binomial Regression 
stats  X  X  X  prob twoclass 
Delegates to glm with freely choosable binomial link function via learner parameter link . We set ‘model’ to FALSE by default to save memory. 

classif.blackboost blackboost Gradient Boosting With Regression Trees 
mboost party 
X  X  X  X  prob twoclass 
See ?ctree_control for possible breakage for nominal features with missingness. family has been set to Binomial by default. For ‘family’ ‘AUC’ and ‘AdaExp’ probabilities cannot be predicted. 

classif.boosting adabag Adabag Boosting 
adabag rpart 
X  X  X  prob twoclass multiclass featimp 
xval has been set to 0 by default for speed. 

classif.bst bst Gradient Boosting 
bst rpart 
X  twoclass  Renamed parameter learner to Learner due to nameclash with setHyperPars . Default changes: Learner = "ls" , xval = 0 , and maxdepth = 1 . 

classif.C50 C50 C50 
C50  X  X  X  X  prob twoclass multiclass 

classif.cforest cforest Random forest based on conditional inference trees 
party  X  X  X  X  X  prob twoclass multiclass featimp 
See ?ctree_control for possible breakage for nominal features with missingness. 
classif.clusterSVM clusterSVM Clustered Support Vector Machines 
SwarmSVM LiblineaR 
X  twoclass 
centers set to 2 by default. 

classif.ctree ctree Conditional Inference Trees 
party  X  X  X  X  X  prob twoclass multiclass 
See ?ctree_control for possible breakage for nominal features with missingness. 
classif.cvglmnet cvglmnet GLM with Lasso or Elasticnet Regularization (Cross Validated Lambda) 
glmnet  X  X  X  prob twoclass multiclass 
The family parameter is set to binomial for twoclass problems and to multinomial otherwise. Factors automatically get converted to dummy columns, ordered factors to integer. glmnet uses a global control object for its parameters. mlr resets all control parameters to their defaults before setting the specified parameters and after training. If you are setting glmnet.control parameters through glmnet.control, you need to save and reset them after running the glmnet learner. 

classif.dbnDNN dbn.dnn Deep neural network with weights initialized by DBN 
deepnet  X  prob twoclass multiclass 
output set to "softmax" by default. 

classif.dcSVM dcSVM Divided-Conquer Support Vector Machines 
SwarmSVM e1071 
X  twoclass  
classif.earth fda Flexible Discriminant Analysis 
earth stats 
X  X  X  prob twoclass multiclass 
This learner performs flexible discriminant analysis using the earth algorithm. na.action is set to na.fail and only this is supported.  
classif.evtree evtree Evolutionary learning of globally optimal trees 
evtree  X  X  X  X  prob twoclass multiclass 
pmutatemajor , pmutateminor , pcrossover , psplit , and pprune , are scaled internally to sum to 100. 

classif.extraTrees extraTrees Extremely Randomized Trees 
extraTrees  X  X  prob twoclass multiclass 

classif.fdausc.glm fdausc.glm Generalized Linear Models classification on FDA 
fda.usc  prob twoclass multiclass functionals 
model$C[[1]] is set to quote(classif.glm) 

classif.fdausc.kernel fdausc.kernel Kernel classification on FDA 
fda.usc  prob twoclass multiclass single.functional 
Argument draw=FALSE is used as default. 

classif.fdausc.knn fdausc.knn fdausc.knn 
fda.usc  X  prob twoclass multiclass single.functional 
Argument draw=FALSE is used as default. 

classif.fdausc.np fdausc.np Nonparametric classification on FDA 
fda.usc  prob twoclass multiclass single.functional 
Argument draw=FALSE is used as default. Additionally, mod$C[[1]] is set to quote(classif.np) 

classif.featureless featureless Featureless classifier 
mlr  X  X  X  X  prob twoclass multiclass functionals 

classif.fnn fnn Fast k-Nearest Neighbour 
FNN  X  twoclass multiclass 

classif.gamboost gamboost Gradient boosting with smooth components 
mboost  X  X  X  prob twoclass 
family has been set to Binomial() by default. For ‘family’ ‘AUC’ and ‘AdaExp’ probabilities cannot be predicted. 

classif.gaterSVM gaterSVM Mixture of SVMs with Neural Network Gater Function 
SwarmSVM  X  twoclass 
m set to 3 and max.iter set to 1 by default. 

classif.gausspr gausspr Gaussian Processes 
kernlab  X  X  prob twoclass multiclass 
Kernel parameters have to be passed directly and not by using the kpar list in gausspr . Note that fit has been set to FALSE by default for speed. 

classif.gbm gbm Gradient Boosting Machine 
gbm  X  X  X  X  prob twoclass multiclass featimp 
keep.data is set to FALSE to reduce memory requirements. Note on param ‘distribution’: gbm will select ‘bernoulli’ by default for 2 classes, and ‘multinomial’ for multiclass problems. The latter is the only setting that works for > 2 classes. 

classif.geoDA geoda Geometric Predictive Discriminant Analysis 
DiscriMiner  X  twoclass multiclass 

classif.glmboost glmboost Boosting for GLMs 
mboost  X  X  X  prob twoclass 
family has been set to Binomial by default. For ‘family’ ‘AUC’ and ‘AdaExp’ probabilities cannot be predicted. 

classif.glmnet glmnet GLM with Lasso or Elasticnet Regularization 
glmnet  X  X  X  prob twoclass multiclass 
The family parameter is set to binomial for twoclass problems and to multinomial otherwise. Factors automatically get converted to dummy columns, ordered factors to integer. Parameter s (value of the regularization parameter used for predictions) is set to 0.1 by default, but needs to be tuned by the user. glmnet uses a global control object for its parameters. mlr resets all control parameters to their defaults before setting the specified parameters and after training. If you are setting glmnet.control parameters through glmnet.control, you need to save and reset them after running the glmnet learner. 

classif.h2o.deeplearning h2o.dl h2o.deeplearning 
h2o  X  X  X  X  prob twoclass multiclass 
The default value of missing_values_handling is "MeanImputation", so missing values are automatically mean-imputed. 

classif.h2o.gbm h2o.gbm h2o.gbm 
h2o  X  X  X  prob twoclass multiclass 
‘distribution’ is set automatically to ‘gaussian’.  
classif.h2o.glm h2o.glm h2o.glm 
h2o  X  X  X  X  prob twoclass 
family is always set to "binomial" to get a binary classifier. The default value of missing_values_handling is "MeanImputation", so missing values are automatically mean-imputed. 

classif.h2o.randomForest h2o.rf h2o.randomForest 
h2o  X  X  X  prob twoclass multiclass 

classif.IBk ibk k-Nearest Neighbours 
RWeka  X  X  prob twoclass multiclass 

classif.J48 j48 J48 Decision Trees 
RWeka  X  X  X  prob twoclass multiclass 
NAs are directly passed to WEKA with na.action = na.pass . 

classif.JRip jrip Propositional Rule Learner 
RWeka  X  X  X  prob twoclass multiclass 
NAs are directly passed to WEKA with na.action = na.pass . 

classif.kknn kknn k-Nearest Neighbor 
kknn  X  X  prob twoclass multiclass 

classif.knn knn k-Nearest Neighbor 
class  X  twoclass multiclass 

classif.ksvm ksvm Support Vector Machines 
kernlab  X  X  prob twoclass multiclass class.weights 
Kernel parameters have to be passed directly and not by using the kpar list in ksvm . Note that fit has been set to FALSE by default for speed. 

classif.lda lda Linear Discriminant Analysis 
MASS  X  X  prob twoclass multiclass 
Learner parameter predict.method maps to method in predict.lda . 

classif.LiblineaRL1L2SVC liblinl1l2svc L1-Regularized L2-Loss Support Vector Classification 
LiblineaR  X  twoclass multiclass class.weights 

classif.LiblineaRL1LogReg liblinl1logreg L1-Regularized Logistic Regression 
LiblineaR  X  prob twoclass multiclass class.weights 

classif.LiblineaRL2L1SVC liblinl2l1svc L2-Regularized L1-Loss Support Vector Classification 
LiblineaR  X  twoclass multiclass class.weights 

classif.LiblineaRL2LogReg liblinl2logreg L2-Regularized Logistic Regression 
LiblineaR  X  prob twoclass multiclass class.weights 
type = 0 (the default) is primal and type = 7 is dual problem. 

classif.LiblineaRL2SVC liblinl2svc L2-Regularized L2-Loss Support Vector Classification 
LiblineaR  X  twoclass multiclass class.weights 
type = 2 (the default) is primal and type = 1 is dual problem. 

classif.LiblineaRMultiClassSVC liblinmulticlasssvc Support Vector Classification by Crammer and Singer 
LiblineaR  X  twoclass multiclass class.weights 

classif.linDA linda Linear Discriminant Analysis 
DiscriMiner  X  twoclass multiclass 
Set validation = NULL by default to disable internal test set validation. 

classif.logreg logreg Logistic Regression 
stats  X  X  X  prob twoclass 
Delegates to glm with family = binomial(link = 'logit') . We set ‘model’ to FALSE by default to save memory. 

classif.lqa lqa Fitting penalized Generalized Linear Models with the LQA algorithm 
lqa  X  prob twoclass 
penalty has been set to "lasso" and lambda to 0.1 by default. The parameters lambda , gamma , alpha , oscar.c , a , lambda1 and lambda2 are the tuning parameters of the penalty function being used, and correspond to the parameters as named in the respective help files. Parameter c for penalty method oscar has been named oscar.c . Parameters lambda1 and lambda2 correspond to the parameters named ‘lambda_1’ and ‘lambda_2’ of the penalty functions enet , fused.lasso , icb , licb , as well as weighted.fusion . 

classif.lssvm lssvm Least Squares Support Vector Machine 
kernlab  X  X  twoclass multiclass 
fitted has been set to FALSE by default for speed. 

classif.lvq1 lvq1 Learning Vector Quantization 
class  X  twoclass multiclass 

classif.mda mda Mixture Discriminant Analysis 
mda  X  X  prob twoclass multiclass 
keep.fitted has been set to FALSE by default for speed and we use start.method = "lvq" for more robust behavior / less technical crashes. 

classif.mlp mlp Multi-Layer Perceptron 
RSNNS  X  prob twoclass multiclass 

classif.multinom multinom Multinomial Regression 
nnet  X  X  X  prob twoclass multiclass 

classif.naiveBayes nbayes Naive Bayes 
e1071  X  X  X  prob twoclass multiclass 

classif.neuralnet neuralnet Neural Network from neuralnet 
neuralnet  X  prob twoclass 
err.fct has been set to ce and linear.output to FALSE to do classification. 

classif.nnet nnet Neural Network 
nnet  X  X  X  prob twoclass multiclass 
linout=TRUE is hardcoded for regression. size has been set to 3 by default. 

classif.nnTrain nn.train Training Neural Network by Backpropagation 
deepnet  X  prob twoclass multiclass 
output set to softmax by default. max.number.of.layers can be set to control and tune the maximal number of layers specified via hidden . 

classif.nodeHarvest nodeHarvest Node Harvest 
nodeHarvest  X  X  prob twoclass 

classif.OneR oner 1R Classifier 
RWeka  X  X  X  prob twoclass multiclass 
NAs are directly passed to WEKA with na.action = na.pass . 

classif.pamr pamr Nearest shrunken centroid 
pamr  X  prob twoclass 
Threshold for prediction (threshold.predict ) has been set to 1 by default. 

classif.PART part PART Decision Lists 
RWeka  X  X  X  prob twoclass multiclass 
NAs are directly passed to WEKA with na.action = na.pass . 

classif.penalized penalized Penalized Logistic Regression 
penalized  X  X  X  prob twoclass 
trace=FALSE was set by default to disable logging output.  
classif.plr plr Logistic Regression with a L2 Penalty 
stepPlr  X  X  X  prob twoclass 
AIC and BIC penalty types can be selected via the new parameter cp.type . 

classif.plsdaCaret plsdacaret Partial Least Squares (PLS) Discriminant Analysis 
caret pls 
X  prob twoclass 

classif.probit probit Probit Regression 
stats  X  X  X  prob twoclass 
Delegates to glm with family = binomial(link = 'probit') . We set ‘model’ to FALSE by default to save memory. 

classif.qda qda Quadratic Discriminant Analysis 
MASS  X  X  prob twoclass multiclass 
Learner parameter predict.method maps to method in predict.qda . 

classif.quaDA quada Quadratic Discriminant Analysis 
DiscriMiner  X  twoclass multiclass 

classif.randomForest rf Random Forest 
randomForest  X  X  X  prob twoclass multiclass class.weights featimp oobpreds 
Note that the rf can freeze the R process if trained on a task with 1 feature which is constant. This can happen in feature forward selection, also due to resampling, and you need to remove such features with removeConstantFeatures.  
classif.randomForestSRC rfsrc Random Forest 
randomForestSRC  X  X  X  X  X  prob twoclass multiclass featimp oobpreds 
na.action has been set to "na.impute" by default to allow missing data support. 
classif.ranger ranger Random Forests 
ranger  X  X  X  X  prob twoclass multiclass featimp oobpreds 
By default, internal parallelization is switched off (num.threads = 1 ), verbose output is disabled, respect.unordered.factors is set to order for all splitrules. All settings are changeable. mtry.perc sets mtry to mtry.perc*getTaskNFeats(.task) . Default for mtry is the floor of square root of number of features in task. Default for min.node.size is 1 for classification and 10 for probability estimation. 

classif.rda rda Regularized Discriminant Analysis 
klaR  X  X  prob twoclass multiclass 
estimate.error has been set to FALSE by default for speed. 

classif.rFerns rFerns Random ferns 
rFerns  X  X  X  twoclass multiclass oobpreds 

classif.rknn rknn Random k-Nearest-Neighbors 
rknn  X  X  twoclass multiclass 
k restricted to < 99 as the code allocates arrays of static size  
classif.rotationForest rotationForest Rotation Forest 
rotationForest  X  X  X  prob twoclass 

classif.rpart rpart Decision Tree 
rpart  X  X  X  X  X  prob twoclass multiclass featimp 
xval has been set to 0 by default for speed. 
classif.RRF RRF Regularized Random Forests 
RRF  X  X  prob twoclass multiclass featimp 

classif.rrlda rrlda Robust Regularized Linear Discriminant Analysis 
rrlda  X  twoclass multiclass 

classif.saeDNN sae.dnn Deep neural network with weights initialized by Stacked AutoEncoder 
deepnet  X  prob twoclass multiclass 
output set to "softmax" by default. 

classif.sda sda Shrinkage Discriminant Analysis 
sda  X  prob twoclass multiclass 

classif.sparseLDA sparseLDA Sparse Discriminant Analysis 
sparseLDA MASS elasticnet 
X  prob twoclass multiclass 
Arguments Q and stop are not yet provided as they depend on the task. 

classif.svm svm Support Vector Machines (libsvm) 
e1071  X  X  prob twoclass multiclass class.weights 

classif.xgboost xgboost eXtreme Gradient Boosting 
xgboost  X  X  X  prob twoclass multiclass featimp 
All settings are passed directly, rather than through xgboost ’s params argument. nrounds has been set to 1 and verbose to 0 by default. num_class is set internally, so do not set this manually. 
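Many notes in the table above mention defaults that mlr changes relative to the underlying package, such as `xval = 0` for rpart-based learners. The following minimal sketch, assuming the mlr and rpart packages are installed, shows how such a changed default can be inspected and overridden; the value `10L` is an arbitrary illustration.

```r
library(mlr)

# mlr sets xval = 0 for classif.rpart, as noted in the table.
lrn <- makeLearner("classif.rpart")
getHyperPars(lrn)

# Override the changed default; setHyperPars returns a modified copy.
lrn_cv <- setHyperPars(lrn, xval = 10L)
getHyperPars(lrn_cv)
```

Hyperparameters can equally be supplied when the learner is created, for example `makeLearner("classif.rpart", xval = 10L)`.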
Additional learner properties:
Class / Short Name / Name  Packages  Num.  Fac.  Ord.  NAs  Weights  Props  Note 

regr.bartMachine bartmachine Bayesian Additive Regression Trees 
bartMachine  X  X  X 
use_missing_data has been set to TRUE by default to allow missing data support. 

regr.bcart bcart Bayesian CART 
tgp  X  X  se  
regr.bgp bgp Bayesian Gaussian Process 
tgp  X  se  
regr.bgpllm bgpllm Bayesian Gaussian Process with jumps to the Limiting Linear Model 
tgp  X  se  
regr.blackboost blackboost Gradient Boosting with Regression Trees 
mboost party 
X  X  X  X  See ?ctree_control for possible breakage for nominal features with missingness. 

regr.blm blm Bayesian Linear Model 
tgp  X  se  
regr.brnn brnn Bayesian regularization for feedforward neural networks 
brnn  X  X  
regr.bst bst Gradient Boosting 
bst rpart 
X  Renamed parameter learner to Learner due to nameclash with setHyperPars . Default changes: Learner = "ls" , xval = 0 , and maxdepth = 1 . 

regr.btgp btgp Bayesian Treed Gaussian Process 
tgp  X  X  se  
regr.btgpllm btgpllm Bayesian Treed Gaussian Process with jumps to the Limiting Linear Model 
tgp  X  X  se  
regr.btlm btlm Bayesian Treed Linear Model 
tgp  X  X  se  
regr.cforest cforest Random Forest Based on Conditional Inference Trees 
party  X  X  X  X  X  featimp  See ?ctree_control for possible breakage for nominal features with missingness. 
regr.crs crs Regression Splines 
crs  X  X  X  se  
regr.ctree ctree Conditional Inference Trees 
party  X  X  X  X  X  See ?ctree_control for possible breakage for nominal features with missingness. 

regr.cubist cubist Cubist 
Cubist  X  X  X  
regr.cvglmnet cvglmnet GLM with Lasso or Elasticnet Regularization (Cross Validated Lambda) 
glmnet  X  X  X  Factors automatically get converted to dummy columns, ordered factors to integer. glmnet uses a global control object for its parameters. mlr resets all control parameters to their defaults before setting the specified parameters and after training. If you are setting glmnet.control parameters through glmnet.control, you need to save and reset them after running the glmnet learner.  
regr.earth earth Multivariate Adaptive Regression Splines 
earth  X  X  
regr.elmNN elmNN Extreme Learning Machine for Single Hidden Layer Feedforward Neural Networks 
elmNN  X 
nhid has been set to 1 and actfun has been set to "sig" by default. 

regr.evtree evtree Evolutionary learning of globally optimal trees 
evtree  X  X  X  X 
pmutatemajor , pmutateminor , pcrossover , psplit , and pprune , are scaled internally to sum to 100. 

regr.extraTrees extraTrees Extremely Randomized Trees 
extraTrees  X  X  
regr.FDboost FDboost Functional linear array regression boosting 
FDboost mboost 
X  functionals  Only allow one base learner for functional covariate and one base learner for scalar covariate, the parameters for these base learners are the same. Also we currently do not support interaction between scalar covariates  
regr.featureless featureless Featureless regression 
mlr  X  X  X  X  functionals  
regr.fnn fnn Fast k-Nearest Neighbor 
FNN  X  
regr.frbs frbs Fuzzy Rule-based Systems 
frbs  X  
regr.gamboost gamboost Gradient Boosting with Smooth Components 
mboost  X  X  X  
regr.gausspr gausspr Gaussian Processes 
kernlab  X  X  se  Kernel parameters have to be passed directly and not by using the kpar list in gausspr . Note that fit has been set to FALSE by default for speed. 

regr.gbm gbm Gradient Boosting Machine 
gbm  X  X  X  X  featimp 
keep.data is set to FALSE to reduce memory requirements, distribution has been set to "gaussian" by default. 

regr.glm glm Generalized Linear Regression 
stats  X  X  X  se  ‘family’ must be a character and every family has its own link, i.e. family = ‘gaussian’, link.gaussian = ‘identity’, which is also the default. We set ‘model’ to FALSE by default to save memory.  
regr.glmboost glmboost Boosting for GLMs 
mboost  X  X  X  
regr.glmnet glmnet GLM with Lasso or Elasticnet Regularization 
glmnet  X  X  X  X  Factors automatically get converted to dummy columns, ordered factors to integer. Parameter s (value of the regularization parameter used for predictions) is set to 0.1 by default, but needs to be tuned by the user. glmnet uses a global control object for its parameters. mlr resets all control parameters to their defaults before setting the specified parameters and after training. If you are setting glmnet.control parameters through glmnet.control, you need to save and reset them after running the glmnet learner. 

regr.GPfit GPfit Gaussian Process 
GPfit  X  se  (1) As the optimization routine assumes that the inputs are scaled to the unit hypercube [0,1]^d, the input gets scaled for each variable by default. If this is not wanted, scale = FALSE has to be set. (2) We replace the GPfit parameter ‘corr = list(type = "exponential", power = 1.95)’ with the separate parameters ‘type’ and ‘power’. In the case of corr = list(type = "matern", nu = 0.5), the separate parameters are ‘type’ and ‘matern_nu_k = 0’, and nu is computed by ‘nu = (2 * matern_nu_k + 1) / 2 = 0.5’.  
regr.h2o.deeplearning h2o.dl h2o.deeplearning 
h2o  X  X  X  X  The default value of missing_values_handling is "MeanImputation", so missing values are automatically mean-imputed. 

regr.h2o.gbm h2o.gbm h2o.gbm 
h2o  X  X  X  ‘distribution’ is set automatically to ‘gaussian’.  
regr.h2o.glm h2o.glm h2o.glm 
h2o  X  X  X  X 
family is always set to "gaussian". The default value of missing_values_handling is "MeanImputation", so missing values are automatically mean-imputed. 

regr.h2o.randomForest h2o.rf h2o.randomForest 
h2o  X  X  X  
regr.IBk ibk K-Nearest Neighbours 
RWeka  X  X  
regr.kknn kknn K-Nearest-Neighbor regression 
kknn  X  X  
regr.km km Kriging 
DiceKriging  X  se  In predict, we currently always use type = "SK". The extra parameter jitter (default is FALSE) enables adding a very small jitter (order 1e-12) to the x-values before prediction, as predict.km reproduces the exact y-values of the training data points when you pass them in, even if the nugget effect is turned on. We further introduced nugget.stability, which sets the nugget to nugget.stability * var(y) before each training to improve numerical stability. We recommend a setting of 10^-8. 

regr.ksvm ksvm Support Vector Machines 
kernlab  X  X  Kernel parameters have to be passed directly and not by using the kpar list in ksvm . Note that fit has been set to FALSE by default for speed. 

regr.laGP laGP Local Approximate Gaussian Process 
laGP  X  se  
regr.LiblineaRL2L1SVR liblinl2l1svr L2-Regularized L1-Loss Support Vector Regression 
LiblineaR  X  Parameter svr_eps has been set to 0.1 by default. 

regr.LiblineaRL2L2SVR liblinl2l2svr L2-Regularized L2-Loss Support Vector Regression 
LiblineaR  X 
type = 11 (the default) is primal and type = 12 is dual problem. Parameter svr_eps has been set to 0.1 by default. 

regr.lm lm Simple Linear Regression 
stats  X  X  X  se  
regr.mars mars Multivariate Adaptive Regression Splines 
mda  X  
regr.mob mob Model-based Recursive Partitioning Yielding a Tree with Fitted Models Associated with each Terminal Node 
party modeltools 
X  X  X  
regr.nnet nnet Neural Network 
nnet  X  X  X 
size has been set to 3 by default. 

regr.nodeHarvest nodeHarvest Node Harvest 
nodeHarvest  X  X  
regr.pcr pcr Principal Component Regression 
pls  X  X  
regr.penalized penalized Penalized Regression 
penalized  X  X  trace=FALSE was set by default to disable logging output.  
regr.plsr plsr Partial Least Squares Regression 
pls  X  X  
regr.randomForest rf Random Forest 
randomForest  X  X  X  featimp oobpreds se 
See ?regr.randomForest for information about se estimation. Note that the rf can freeze the R process if trained on a task with 1 feature which is constant. This can happen in feature forward selection, also due to resampling, and you need to remove such features with removeConstantFeatures. keep.inbag is NULL by default but if predict.type = ‘se’ and se.method = ‘jackknife’ (the default) then it is automatically set to TRUE. 

regr.randomForestSRC rfsrc Random Forest 
randomForestSRC  X  X  X  X  X  featimp oobpreds 
na.action has been set to "na.impute" by default to allow missing data support. 
regr.ranger ranger Random Forests 
ranger  X  X  X  X  featimp oobpreds se 
By default, internal parallelization is switched off (num.threads = 1 ), verbose output is disabled, respect.unordered.factors is set to order for all splitrules. All settings are changeable. mtry.perc sets mtry to mtry.perc*getTaskNFeats(.task) . Default for mtry is the floor of square root of number of features in task. 

regr.rknn rknn Random k-Nearest-Neighbors 
rknn  X  X  
regr.rpart rpart Decision Tree 
rpart  X  X  X  X  X  featimp 
xval has been set to 0 by default for speed. 
regr.RRF RRF Regularized Random Forests 
RRF  X  X  X  featimp  
regr.rsm rsm Response Surface Regression 
rsm  X  You select the order of the regression by using modelfun = "FO" (first order), "TWI" (two-way interactions, this is with 1st order terms!) and "SO" (full second order). 

regr.rvm rvm Relevance Vector Machine 
kernlab  X  X  Kernel parameters have to be passed directly and not by using the kpar list in rvm . Note that fit has been set to FALSE by default for speed. 

regr.slim slim Sparse Linear Regression using Nonsmooth Loss Functions and L1 Regularization 
flare  X 
lambda.idx has been set to 3 by default. 

regr.svm svm Support Vector Machines (libsvm) 
e1071  X  X  
regr.xgboost xgboost eXtreme Gradient Boosting 
xgboost  X  X  X  featimp  All settings are passed directly, rather than through xgboost ’s params argument. nrounds has been set to 1 and verbose to 0 by default. 
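For regression learners, the `se` entry in column Props marks methods that can predict standard errors. A minimal sketch, assuming the mlr package is installed and using its bundled `bh.task` example task; `regr.lm` is chosen only because the table lists it with the `se` property.

```r
library(mlr)

# Request standard errors in addition to the point prediction.
lrn <- makeLearner("regr.lm", predict.type = "se")
mod <- train(lrn, bh.task)
pred <- predict(mod, task = bh.task)

# The prediction data frame contains a "response" and an "se" column.
pred_df <- as.data.frame(pred)
head(pred_df)
```

Requesting `predict.type = "se"` for a learner without the `se` property raises an error when the learner is created.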
Additional learner properties:
Class / Short Name / Name  Packages  Num.  Fac.  Ord.  NAs  Weights  Props  Note 

surv.cforest crf Random Forest based on Conditional Inference Trees 
party survival 
X  X  X  X  X  featimp  See ?ctree_control for possible breakage for nominal features with missingness. 
surv.CoxBoost coxboost Cox Proportional Hazards Model with Componentwise Likelihood based Boosting 
CoxBoost  X  X  X  X  Factors automatically get converted to dummy columns, ordered factors to integer.  
surv.coxph coxph Cox Proportional Hazard Model 
survival  X  X  X  
surv.cv.CoxBoost cv.CoxBoost Cox Proportional Hazards Model with Componentwise Likelihood based Boosting, tuned for the optimal number of boosting steps 
CoxBoost  X  X  X  Factors automatically get converted to dummy columns, ordered factors to integer.  
surv.cvglmnet cvglmnet GLM with Regularization (Cross Validated Lambda) 
glmnet  X  X  X  X  Factors automatically get converted to dummy columns, ordered factors to integer.  
surv.gamboost gamboost Gradient boosting with smooth components 
survival mboost 
X  X  X  X 
family has been set to CoxPH() by default. 

surv.gbm gbm Gradient Boosting Machine 
gbm  X  X  X  X  featimp 
keep.data is set to FALSE to reduce memory requirements. 

surv.glmboost glmboost Gradient Boosting with Componentwise Linear Models 
survival mboost 
X  X  X  X 
family has been set to CoxPH() by default. 

surv.glmnet glmnet GLM with Regularization 
glmnet  X  X  X  X  Factors automatically get converted to dummy columns, ordered factors to integer. Parameter s (value of the regularization parameter used for predictions) is set to 0.1 by default, but needs to be tuned by the user. glmnet uses a global control object for its parameters. mlr resets all control parameters to their defaults before setting the specified parameters and after training. If you are setting glmnet.control parameters through glmnet.control, you need to save and reset them after running the glmnet learner. 

surv.randomForestSRC rfsrc Random Forest 
survival randomForestSRC 
X  X  X  X  X  featimp oobpreds 
na.action has been set to "na.impute" by default to allow missing data support. 
surv.ranger ranger Random Forests 
ranger  X  X  X  featimp  By default, internal parallelization is switched off (num.threads = 1 ), verbose output is disabled, respect.unordered.factors is set to order for all splitrules. All settings are changeable. 

surv.rpart rpart Survival Tree 
rpart  X  X  X  X  X  featimp 
xval has been set to 0 by default for speed. 
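The glmnet rows in the tables above note that mlr resets glmnet's global control parameters, so settings made through `glmnet.control` have to be saved and restored by the user. The following is a minimal sketch of that save-and-restore pattern, assuming the glmnet package is installed; the value `mxit = 200` is arbitrary, and the explicit factory reset only stands in for what happens while an mlr glmnet learner is trained.

```r
library(glmnet)

# A custom global setting made by the user.
glmnet.control(mxit = 200)

# Save the current global settings before running the mlr learner.
saved <- glmnet.control()

# ... train an mlr glmnet learner here ...
# Stand-in for the reset to defaults described in the notes:
glmnet.control(factory = TRUE)

# Restore the saved settings afterwards.
do.call(glmnet.control, saved)
restored <- glmnet.control()
restored$mxit
```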
Additional learner properties:
Class / Short Name / Name  Packages  Num.  Fac.  Ord.  NAs  Weights  Props  Note 

cluster.cmeans cmeans Fuzzy C-Means Clustering 
e1071 clue 
X  prob  The predict method uses cl_predict from the clue package to compute the cluster memberships for new data. The default centers = 2 is added so the method runs without setting parameters, but in practice the user should choose this value. 

cluster.Cobweb cobweb Cobweb Clustering Algorithm 
RWeka  X  
cluster.dbscan dbscan DBScan Clustering 
fpc  X  A cluster index of NA indicates noise points. Specify method = 'dist' if the data should be interpreted as dissimilarity matrix or object. Otherwise Euclidean distances will be used. 

cluster.EM em Expectation-Maximization Clustering 
RWeka  X  
cluster.FarthestFirst farthestfirst FarthestFirst Clustering Algorithm 
RWeka  X  
cluster.kkmeans kkmeans Kernel K-Means 
kernlab  X 
centers has been set to 2L by default. The nearest center in kernel distance determines cluster assignment of new data points. Kernel parameters have to be passed directly and not by using the kpar list in kkmeans.

cluster.kmeans kmeans K-Means 
stats clue 
X  prob  The predict method uses cl_predict from the clue package to compute the cluster memberships for new data. The default centers = 2 is added so the method runs without setting parameters, but in practice the user should choose this value. 

cluster.SimpleKMeans simplekmeans K-Means Clustering 
RWeka  X  
cluster.XMeans xmeans XMeans (k-means with automatic determination of k) 
RWeka  X  You may have to install the XMeans Weka package: WPM('install-package', 'XMeans'). 
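The clustering notes above point out that the default `centers = 2` is only a placeholder. A minimal sketch of setting it explicitly, assuming the mlr and clue packages are installed; the iris measurements and the value `3L` are illustrative choices, not recommendations.

```r
library(mlr)
set.seed(1)

# Cluster task on the four numeric iris columns (no target variable).
task <- makeClusterTask(data = iris[, 1:4])

# Override the placeholder default of centers = 2.
lrn <- makeLearner("cluster.kmeans", centers = 3L)
mod <- train(lrn, task)

# Cluster assignments for the training data.
pred <- predict(mod, task = task)
clusters <- getPredictionResponse(pred)
table(clusters)
```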
For ordinary misclassification costs you can use all the standard classification methods listed above.
For example-dependent costs there are several ways to generate cost-sensitive learners from ordinary regression and classification learners. See section cost-sensitive classification and the documentation of makeCostSensClassifWrapper(), makeCostSensRegrWrapper() and makeCostSensWeightedPairsWrapper() for details.
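As a rough illustration of the wrapper approach, the sketch below turns an ordinary classification learner into a cost-sensitive one. It assumes the mlr and rpart packages are installed; the cost matrix is randomly generated purely for demonstration, and the task id `iris_costs` is a made-up name.

```r
library(mlr)
set.seed(1)

# Example-dependent costs: one row per observation, one column per class,
# with zero cost for predicting the true class.
feats <- iris[, 1:4]
cost <- matrix(runif(150 * 3, 0, 2000), nrow = 150) *
  (1 - diag(3))[iris$Species, ]
colnames(cost) <- levels(iris$Species)

task <- makeCostSensTask(id = "iris_costs", data = feats, cost = cost)

# Wrap an ordinary classification learner.
lrn <- makeCostSensClassifWrapper(makeLearner("classif.rpart"))
mod <- train(lrn, task)
pred <- predict(mod, task = task)
resp <- getPredictionResponse(pred)
table(resp)
```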
Class / Short Name / Name  Packages  Num.  Fac.  Ord.  NAs  Weights  Props  Note 

multilabel.cforest cforest Random forest based on conditional inference trees 
party  X  X  X  X  X  prob  
multilabel.randomForestSRC rfsrc Random Forest 
randomForestSRC  X  X  X  X  prob 
na.action has been set to na.impute by default to allow missing data support. 

multilabel.rFerns rFerns Random ferns 
rFerns  X  X  X 
Moreover, you can use the binary relevance method to apply ordinary classification learners to the multilabel problem. See the documentation of function makeMultilabelBinaryRelevanceWrapper() and the tutorial section on multilabel classification for details.
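A minimal sketch of the binary relevance approach, assuming the mlr and rpart packages are installed and using mlr's bundled `yeast.task` multilabel example; the base learner is an arbitrary choice.

```r
library(mlr)

# Wrap an ordinary classifier so that one binary model is fit per label.
base <- makeLearner("classif.rpart", predict.type = "prob")
lrn <- makeMultilabelBinaryRelevanceWrapper(base)

mod <- train(lrn, yeast.task)
pred <- predict(mod, task = yeast.task)

# Hamming loss: fraction of label entries predicted incorrectly.
hamloss <- performance(pred, measures = multilabel.hamloss)
hamloss
```

Because the evaluation here uses the training data, the reported loss is optimistic; a resampling strategy would give a fairer estimate.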