Cost-Sensitive Classification
In regular classification the aim is to minimize the misclassification rate, and thus all types of misclassification errors are deemed equally severe. A more general setting is cost-sensitive classification, where the costs caused by different kinds of errors are not assumed to be equal and the objective is to minimize the expected costs.
In case of class-dependent costs the costs depend on the true and predicted class label. The costs $c(j, k)$ of predicting class $k$ if the true label is $j$ are usually organized into a $K \times K$ cost matrix, where $K$ is the number of classes. Naturally, it is assumed that the cost of predicting the correct class label is minimal (that is, $c(j, j) \leq c(j, k)$ for all $k = 1, \ldots, K$).
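To make this concrete, the average cost of a set of predictions under such a matrix can be computed directly. The following base-R sketch uses a small illustrative cost matrix and made-up labels:

```r
## Illustrative 2x2 cost matrix: rows = true class, columns = predicted class
costs = matrix(c(0, 1, 5, 0), nrow = 2,
               dimnames = list(c("pos", "neg"), c("pos", "neg")))

## Average cost of predictions: look up costs[true, predicted]
## for each observation and take the mean
avgCost = function(truth, response, costs) {
  mean(costs[cbind(truth, response)])
}

truth    = c("pos", "pos", "neg", "neg")
response = c("pos", "neg", "pos", "neg")
avgCost(truth, response, costs)
#> [1] 1.5
```

Misclassifying the second observation costs 5 and the third costs 1, so the average is (0 + 5 + 1 + 0)/4 = 1.5. This is exactly the kind of aggregation a cost-based performance measure carries out.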
A further generalization of this scenario are example-dependent misclassification costs, where each example $(x, y)$ is coupled with an individual cost vector of length $K$. Its $k$-th component expresses the cost of assigning $x$ to class $k$. A real-world example is fraud detection, where the costs depend not only on the true and predicted status fraud/non-fraud, but also on the amount of money involved in each case. Naturally, the cost of predicting the true class label $y$ is assumed to be minimal. The true class labels are redundant information, as they can be easily inferred from the cost vectors. Moreover, given the cost vector, the expected costs do not depend on the true class label $y$. The classification problem is therefore completely defined by the feature values $x$ and the corresponding cost vectors.
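A minimal base-R sketch of example-dependent costs (the numbers are made up): each row is one example's cost vector, the true label is the minimum-cost class, and predictions are scored by looking up the cost of the assigned class.

```r
## Rows = examples, columns = classes; entry [i, k] is the cost of
## assigning example i to class k (illustrative values)
cost = matrix(c(0, 12, 3,
                8, 0, 0.5,
                4, 9, 0), nrow = 3, byrow = TRUE,
              dimnames = list(NULL, c("a", "b", "c")))

## The true labels can be inferred as the minimum-cost class of each row
truth = colnames(cost)[apply(cost, 1, which.min)]
truth
#> [1] "a" "b" "c"

## Average cost of a hypothetical vector of predictions
pred = c("a", "c", "b")
mean(cost[cbind(seq_len(nrow(cost)), match(pred, colnames(cost)))])
#> [1] 3.166667
```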
In the following we show ways to handle cost-sensitive classification problems in mlr. Some of the functionality is currently experimental, and there may be changes in the future.
Class-dependent misclassification costs
There are some classification methods that can accommodate misclassification costs directly. One example is rpart.
Alternatively, we can use cost-insensitive methods and manipulate the predictions or the training data in order to take misclassification costs into account. mlr supports thresholding and rebalancing.

Thresholding: The thresholds used to turn posterior probabilities into class labels are chosen such that the costs are minimized. This requires a Learner that can predict posterior probabilities. During training the costs are not taken into account.

Rebalancing: The idea is to change the proportion of the classes in the training data set in order to account for costs during training, either by weighting or by sampling. Rebalancing does not require that the Learner can predict probabilities.
i. For weighting we need a Learner that supports class weights or observation weights.
ii. If the Learner cannot deal with weights the proportion of classes can be changed by over- and undersampling.
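The core of the thresholding idea fits in two lines of base R (the probabilities are made up): lowering the threshold for the positive class makes the classifier predict it more often.

```r
## Predicted posterior probabilities for the positive class (illustrative)
prob.pos = c(0.95, 0.40, 0.10, 0.20)

## Default threshold 0.5 vs. a lowered threshold 0.17
ifelse(prob.pos >= 0.5, "pos", "neg")
#> [1] "pos" "neg" "neg" "neg"
ifelse(prob.pos >= 0.17, "pos", "neg")
#> [1] "pos" "pos" "neg" "pos"
```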
We start with binary classification problems and afterwards deal with multi-class problems.
Binary classification problems
The positive and negative classes are labeled $+1$ and $-1$, respectively, and we consider the following cost matrix where the rows indicate true classes and the columns predicted classes:

| true/pred. | $+1$ | $-1$ |
| --- | --- | --- |
| $+1$ | $c(+1, +1)$ | $c(+1, -1)$ |
| $-1$ | $c(-1, +1)$ | $c(-1, -1)$ |
Often, the diagonal entries are zero or the cost matrix is rescaled to achieve zeros in the diagonal (see for example O'Brien et al., 2008).
A well-known cost-sensitive classification problem is posed by the German Credit data set (see also the UCI Machine Learning Repository). The corresponding cost matrix (though Elkan (2001) argues that this matrix is economically unreasonable) is given as:
| true/pred. | Bad | Good |
| --- | --- | --- |
| Bad | 0 | 5 |
| Good | 1 | 0 |
As in the table above, the rows indicate true and the columns predicted classes.
In case of class-dependent costs it is sufficient to generate an ordinary ClassifTask. A CostSensTask is only needed if the costs are example-dependent. In the R code below we create the ClassifTask, remove two constant features from the data set and generate the cost matrix. Per default, Bad is the positive class.
data(GermanCredit, package = "caret")
credit.task = makeClassifTask(data = GermanCredit, target = "Class")
credit.task = removeConstantFeatures(credit.task)
#> Removing 2 columns: Purpose.Vacation,Personal.Female.Single
credit.task
#> Supervised task: GermanCredit
#> Type: classif
#> Target: Class
#> Observations: 1000
#> Features:
#> numerics factors ordered
#> 59 0 0
#> Missings: FALSE
#> Has weights: FALSE
#> Has blocking: FALSE
#> Classes: 2
#> Bad Good
#> 300 700
#> Positive class: Bad
costs = matrix(c(0, 1, 5, 0), 2)
colnames(costs) = rownames(costs) = getTaskClassLevels(credit.task)
costs
#> Bad Good
#> Bad 0 5
#> Good 1 0
1. Thresholding
We start by fitting a logistic regression model to the German credit data set and predict posterior probabilities.
## Train and predict posterior probabilities
lrn = makeLearner("classif.multinom", predict.type = "prob", trace = FALSE)
mod = train(lrn, credit.task)
pred = predict(mod, task = credit.task)
pred
#> Prediction: 1000 observations
#> predict.type: prob
#> threshold: Bad=0.50,Good=0.50
#> time: 0.01
#> id truth prob.Bad prob.Good response
#> 1 1 Good 0.03525092 0.9647491 Good
#> 2 2 Bad 0.63222363 0.3677764 Bad
#> 3 3 Good 0.02807414 0.9719259 Good
#> 4 4 Good 0.25182703 0.7481730 Good
#> 5 5 Bad 0.75193275 0.2480673 Bad
#> 6 6 Good 0.26230149 0.7376985 Good
#> ... (1000 rows, 5 cols)
The default thresholds for both classes are 0.5. But according to the cost matrix we should predict class Good only if we are very sure that Good is indeed the correct label. Therefore we should increase the threshold for class Good and decrease the threshold for class Bad.
i. Theoretical thresholding
The theoretical threshold for the positive class can be calculated from the cost matrix as
$$t^* = \frac{c(-1, +1) - c(-1, -1)}{c(-1, +1) - c(-1, -1) + c(+1, -1) - c(+1, +1)}.$$
For more details see Elkan (2001).
Below the theoretical threshold for the German credit example is calculated and used to predict class labels. Since the diagonal of the cost matrix is zero the formula given above simplifies accordingly.
## Calculate the theoretical threshold for the positive class
th = costs[2,1]/(costs[2,1] + costs[1,2])
th
#> [1] 0.1666667
As you may recall you can change thresholds in mlr either before training by using the predict.threshold option of makeLearner or after prediction by calling setThreshold on the Prediction object.

As we already have a prediction we use the setThreshold function. It returns an altered Prediction object with class predictions for the theoretical threshold.
## Predict class labels according to the theoretical threshold
pred.th = setThreshold(pred, th)
pred.th
#> Prediction: 1000 observations
#> predict.type: prob
#> threshold: Bad=0.17,Good=0.83
#> time: 0.01
#> id truth prob.Bad prob.Good response
#> 1 1 Good 0.03525092 0.9647491 Good
#> 2 2 Bad 0.63222363 0.3677764 Bad
#> 3 3 Good 0.02807414 0.9719259 Good
#> 4 4 Good 0.25182703 0.7481730 Bad
#> 5 5 Bad 0.75193275 0.2480673 Bad
#> 6 6 Good 0.26230149 0.7376985 Bad
#> ... (1000 rows, 5 cols)
In order to calculate the average costs over the entire data set we first need to create a new performance Measure. This can be done through function makeCostMeasure. It is expected that the rows of the cost matrix indicate true and the columns predicted class labels.
credit.costs = makeCostMeasure(id = "credit.costs", name = "Credit costs", costs = costs,
best = 0, worst = 5)
credit.costs
#> Name: Credit costs
#> Performance measure: credit.costs
#> Properties: classif,classif.multi,req.pred,req.truth,predtype.response,predtype.prob
#> Minimize: TRUE
#> Best: 0; Worst: 5
#> Aggregated by: test.mean
#> Arguments: <unnamed>=<matrix>, <unnamed>=<function>
#> Note:
Then the average costs can be computed by function performance. Below we compare the average costs and the error rate (mmce) of the learning algorithm with both default thresholds 0.5 and theoretical thresholds.
## Performance with default thresholds 0.5
performance(pred, measures = list(credit.costs, mmce))
#> credit.costs mmce
#> 0.774 0.214
## Performance with theoretical thresholds
performance(pred.th, measures = list(credit.costs, mmce))
#> credit.costs mmce
#> 0.478 0.346
These performance values may be overly optimistic as we used the same data set for training and prediction, and resampling strategies should be preferred.

In the R code below we make use of the predict.threshold argument of makeLearner to set the threshold before doing a 3-fold cross-validation on the credit.task. Note that we create a ResampleInstance (rin) that is used throughout the next several code chunks in order to get comparable performance values.
## Cross-validated performance with theoretical thresholds
rin = makeResampleInstance("CV", iters = 3, task = credit.task)
lrn = makeLearner("classif.multinom", predict.type = "prob", predict.threshold = th, trace = FALSE)
r = resample(lrn, credit.task, resampling = rin, measures = list(credit.costs, mmce), show.info = FALSE)
#> Resampling: crossvalidation
#> Measures: credit.costs mmce
r
#> Resample Result
#> Task: GermanCredit
#> Learner: classif.multinom
#> Aggr perf: credit.costs.test.mean=0.5831280,mmce.test.mean=0.3630397
#> Runtime: 0.204372
If we are also interested in the cross-validated performance for the default threshold values we can call setThreshold on the resample prediction r$pred.
## Cross-validated performance with default thresholds
performance(setThreshold(r$pred, 0.5), measures = list(credit.costs, mmce))
#> credit.costs mmce
#> 0.8600427 0.2520215
Theoretical thresholding is only reliable if the predicted posterior probabilities are correct. If there is bias the thresholds have to be shifted accordingly.
Useful in this regard is function plotThreshVsPerf that you can use to plot the average costs as well as any other performance measure versus possible threshold values for the positive class in $[0, 1]$. The underlying data is generated by generateThreshVsPerfData.
The following plots show the cross-validated costs and error rate (mmce). The theoretical threshold th calculated above is indicated by the vertical line. As you can see from the left-hand plot the theoretical threshold seems a bit large.
d = generateThreshVsPerfData(r, measures = list(credit.costs, mmce))
plotThreshVsPerf(d, mark.th = th)
ii. Empirical thresholding
The idea of empirical thresholding (see Sheng and Ling, 2006) is to select cost-optimal threshold values for a given learning method based on the training data. In contrast to theoretical thresholding it suffices if the estimated posterior probabilities are order-correct.
In order to determine optimal threshold values you can use mlr's function tuneThreshold. As tuning the threshold on the complete training data set can lead to overfitting, you should use resampling strategies. Below we perform 3-fold cross-validation and use tuneThreshold to calculate threshold values with lowest average costs over the 3 test data sets.
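tuneThreshold performs a proper one-dimensional optimization internally; the underlying idea can nevertheless be sketched with a plain grid search in base R (probabilities, labels and the grid below are illustrative):

```r
## Grid search for the cost-minimal threshold (illustrative data)
costs = matrix(c(0, 1, 5, 0), 2,
               dimnames = list(c("pos", "neg"), c("pos", "neg")))
prob.pos = c(0.9, 0.6, 0.4, 0.3, 0.2, 0.05)
truth = c("pos", "neg", "pos", "neg", "pos", "neg")

## Average cost resulting from a given threshold
avgCost = function(th) {
  response = ifelse(prob.pos >= th, "pos", "neg")
  mean(costs[cbind(truth, response)])
}

grid = seq(0.05, 0.95, by = 0.05)
perf = vapply(grid, avgCost, numeric(1))
grid[which.min(perf)]
#> [1] 0.1
```

Since false negatives cost five times as much as false positives here, the cost-minimal threshold is far below the default 0.5.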
lrn = makeLearner("classif.multinom", predict.type = "prob", trace = FALSE)
## 3-fold cross-validation
r = resample(lrn, credit.task, resampling = rin, measures = list(credit.costs, mmce), show.info = FALSE)
#> Resampling: crossvalidation
#> Measures: credit.costs mmce
r
#> Resample Result
#> Task: GermanCredit
#> Learner: classif.multinom
#> Aggr perf: credit.costs.test.mean=0.8600427,mmce.test.mean=0.2520215
#> Runtime: 0.211691
## Tune the threshold based on the predicted probabilities on the 3 test data sets
tune.res = tuneThreshold(pred = r$pred, measure = credit.costs)
tune.res
#> $th
#> [1] 0.1874551
#>
#> $perf
#> credit.costs
#> 0.569108
tuneThreshold returns the optimal threshold value for the positive class and the corresponding performance. As expected the tuned threshold is smaller than the theoretical threshold.
2. Rebalancing
In order to minimize the average costs, observations from the less costly class should be given higher importance during training. This can be achieved by weighting the classes, provided that the learner under consideration has a 'class weights' or an 'observation weights' argument. To find out which learning methods support either type of weights have a look at the list of integrated learners in the Appendix or use listLearners.
## Learners that accept observation weights
listLearners("classif", properties = "weights")[c("class", "package")]
#> class package
#> 1 classif.binomial stats
#> 2 classif.blackboost mboost,party
#> 3 classif.C50 C50
#> 4 classif.cforest party
#> 5 classif.ctree party
#> 6 classif.cvglmnet glmnet
#> ... (24 rows, 2 cols)
## Learners that can deal with class weights
listLearners("classif", properties = "class.weights")[c("class", "package")]
#> class package
#> 1 classif.ksvm kernlab
#> 2 classif.LiblineaRL1L2SVC LiblineaR
#> 3 classif.LiblineaRL1LogReg LiblineaR
#> 4 classif.LiblineaRL2L1SVC LiblineaR
#> 5 classif.LiblineaRL2LogReg LiblineaR
#> 6 classif.LiblineaRL2SVC LiblineaR
#> ... (9 rows, 2 cols)
Alternatively, over- and undersampling techniques can be used.
i. Weighting
Just as theoretical thresholds, theoretical weights can be calculated from the cost matrix. If $t$ indicates the target threshold and $t_0$ the original threshold for the positive class, the proportion of observations in the positive class has to be multiplied by
$$\frac{1 - t}{t} \cdot \frac{t_0}{1 - t_0}.$$
Alternatively, the proportion of observations in the negative class can be multiplied by the inverse. A proof is given by Elkan (2001).

In most cases, the original threshold is $t_0 = 0.5$ and thus the second factor vanishes. If additionally the target threshold $t$ equals the theoretical threshold $t^*$, the proportion of observations in the positive class has to be multiplied by
$$\frac{1 - t^*}{t^*} = \frac{c(+1, -1) - c(+1, +1)}{c(-1, +1) - c(-1, -1)}.$$

For the credit example the theoretical threshold corresponds to a weight of 5 for the positive class.
## Weight for positive class corresponding to theoretical threshold
w = (1 - th)/th
w
#> [1] 5
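The general formula, including a non-default original threshold, can be written as a small helper function (the value 0.3 below is purely hypothetical):

```r
## Rebalancing weight for the positive class: (1 - t)/t * t0/(1 - t0),
## where t is the target threshold and t0 the original threshold
theoreticalWeight = function(t, t0 = 0.5) (1 - t)/t * t0/(1 - t0)

th = 1/6                          # theoretical threshold of the credit example
theoreticalWeight(th)             # t0 = 0.5, so the second factor equals 1
#> [1] 5
theoreticalWeight(th, t0 = 0.3)   # hypothetical non-default original threshold
```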
A unified and convenient way to assign class weights to a Learner (and tune them) is provided by function makeWeightedClassesWrapper. The class weights are specified using argument wcw.weight. For learners that support observation weights a suitable weight vector is then generated internally during training or resampling. If the learner can deal with class weights, the weights are basically passed on to the appropriate learner parameter. The advantage of using the wrapper in this case is the unified way to specify the class weights.

Below is an example using learner "classif.multinom" (multinom from package nnet) which accepts observation weights. For binary classification problems it is sufficient to specify the weight w for the positive class. The negative class then automatically receives weight 1.
## Weighted learner
lrn = makeLearner("classif.multinom", trace = FALSE)
lrn = makeWeightedClassesWrapper(lrn, wcw.weight = w)
lrn
#> Learner weightedclasses.classif.multinom from package nnet
#> Type: classif
#> Name: ; Short name:
#> Class: WeightedClassesWrapper
#> Properties: twoclass,multiclass,numerics,factors,prob
#> PredictType: response
#> Hyperparameters: trace=FALSE,wcw.weight=5
r = resample(lrn, credit.task, rin, measures = list(credit.costs, mmce), show.info = FALSE)
#> Resampling: crossvalidation
#> Measures: credit.costs mmce
r
#> Resample Result
#> Task: GermanCredit
#> Learner: weightedclasses.classif.multinom
#> Aggr perf: credit.costs.test.mean=0.5851031,mmce.test.mean=0.3530387
#> Runtime: 0.26645
For classification methods like "classif.ksvm" (the support vector machine ksvm in package kernlab) that support class weights you can pass them directly.
lrn = makeLearner("classif.ksvm", class.weights = c(Bad = w, Good = 1))
Or, more conveniently, you can again use makeWeightedClassesWrapper.
lrn = makeWeightedClassesWrapper("classif.ksvm", wcw.weight = w)
r = resample(lrn, credit.task, rin, measures = list(credit.costs, mmce), show.info = FALSE)
#> Resampling: crossvalidation
#> Measures: credit.costs mmce
r
#> Resample Result
#> Task: GermanCredit
#> Learner: weightedclasses.classif.ksvm
#> Aggr perf: credit.costs.test.mean=0.6520802,mmce.test.mean=0.3360276
#> Runtime: 0.443757
Just like the theoretical threshold, the theoretical weights may not always be suitable, so you can tune the weight for the positive class as shown in the following example. Calculating the theoretical weight beforehand may help to narrow down the search interval.
lrn = makeLearner("classif.multinom", trace = FALSE)
lrn = makeWeightedClassesWrapper(lrn)
ps = makeParamSet(makeDiscreteParam("wcw.weight", seq(4, 12, 0.5)))
ctrl = makeTuneControlGrid()
tune.res = tuneParams(lrn, credit.task, resampling = rin, par.set = ps,
measures = list(credit.costs, mmce), control = ctrl, show.info = FALSE)
#> Resampling: crossvalidation
#> Measures: credit.costs mmce
tune.res
#> Tune result:
#> Op. pars: wcw.weight=7.5
#> credit.costs.test.mean=0.5590681,mmce.test.mean=0.3950118
as.data.frame(tune.res$opt.path)[1:3]
#> wcw.weight credit.costs.test.mean mmce.test.mean
#> 1 4 0.5961231 0.3280466
#> 2 4.5 0.5960931 0.3440327
#> 3 5 0.5851031 0.3530387
#> 4 5.5 0.5711010 0.3590327
#> 5 6 0.5761030 0.3720307
#> 6 6.5 0.5851091 0.3850287
#> 7 7 0.5830951 0.3950148
#> 8 7.5 0.5590681 0.3950118
#> 9 8 0.5700671 0.4060108
#> 10 8.5 0.5760671 0.4120108
#> 11 9 0.5780631 0.4180108
#> 12 9.5 0.5790581 0.4230098
#> 13 10 0.5850641 0.4290158
#> 14 10.5 0.5890771 0.4370209
#> 15 11 0.5770891 0.4410249
#> 16 11.5 0.5780901 0.4420259
#> 17 12 0.5770801 0.4450199
ii. Over- and undersampling
If the Learner supports neither observation nor class weights the proportions of the classes in the training data can be changed by over- or undersampling.
In the GermanCredit data set the positive class Bad should receive a theoretical weight of w = (1 - th)/th = 5. This can be achieved by oversampling class Bad with a rate of 5 or by undersampling class Good with a rate of 1/5 (using functions oversample or undersample).
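For a whole-number rate, oversampling amounts to replicating the rows of the respective class. The base-R sketch below mimics this on a made-up data frame (oversample itself samples rows, so fractional rates work as well):

```r
## Oversampling by row replication (illustrative data, whole-number rate)
df = data.frame(x = 1:6, y = c("Bad", "Good", "Good", "Bad", "Good", "Good"))
rate = 3

## Keep the "Good" rows and replicate each "Bad" row `rate` times
idx = c(which(df$y == "Good"), rep(which(df$y == "Bad"), times = rate))
df.over = df[idx, ]
table(df.over$y)
```

The resulting task contains 6 Bad and 4 Good observations, so the former minority class now dominates training.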
credit.task.over = oversample(credit.task, rate = w, cl = "Bad")
lrn = makeLearner("classif.multinom", trace = FALSE)
mod = train(lrn, credit.task.over)
pred = predict(mod, task = credit.task)
performance(pred, measures = list(credit.costs, mmce))
#> credit.costs mmce
#> 0.441 0.325
Note that in the above example the learner was trained on the oversampled task credit.task.over. In order to get the training performance on the original task predictions were calculated for credit.task.
We usually prefer resampled performance values, but simply calling resample on the oversampled task does not work since predictions have to be based on the original task. The solution is to create a wrapped Learner via function makeOversampleWrapper. Internally, oversample is called before training, but predictions are done on the original data.
lrn = makeLearner("classif.multinom", trace = FALSE)
lrn = makeOversampleWrapper(lrn, osw.rate = w, osw.cl = "Bad")
lrn
#> Learner classif.multinom.oversampled from package mlr,nnet
#> Type: classif
#> Name: ; Short name:
#> Class: OversampleWrapper
#> Properties: numerics,factors,weights,prob,twoclass,multiclass
#> PredictType: response
#> Hyperparameters: trace=FALSE,osw.rate=5,osw.cl=Bad
r = resample(lrn, credit.task, rin, measures = list(credit.costs, mmce), show.info = FALSE)
#> Resampling: crossvalidation
#> Measures: credit.costs mmce
r
#> Resample Result
#> Task: GermanCredit
#> Learner: classif.multinom.oversampled
#> Aggr perf: credit.costs.test.mean=0.5681190,mmce.test.mean=0.3360426
#> Runtime: 0.419115
Of course, we can also tune the oversampling rate. For this purpose we again have to create an OversampleWrapper. Optimal values for parameter osw.rate can be obtained using function tuneParams.
lrn = makeLearner("classif.multinom", trace = FALSE)
lrn = makeOversampleWrapper(lrn, osw.cl = "Bad")
ps = makeParamSet(makeDiscreteParam("osw.rate", seq(3, 7, 0.25)))
ctrl = makeTuneControlGrid()
tune.res = tuneParams(lrn, credit.task, rin, par.set = ps, measures = list(credit.costs, mmce),
control = ctrl, show.info = FALSE)
#> Resampling: crossvalidation
#> Measures: credit.costs mmce
tune.res
#> Tune result:
#> Op. pars: osw.rate=6.75
#> credit.costs.test.mean=0.5700821,mmce.test.mean=0.3860178
Multi-class problems
We consider the waveform data set from package mlbench and add an artificial cost matrix:
| true/pred. | 1 | 2 | 3 |
| --- | --- | --- | --- |
| 1 | 0 | 30 | 80 |
| 2 | 5 | 0 | 4 |
| 3 | 10 | 8 | 0 |
We start by creating the Task, the cost matrix and the corresponding performance measure.
## Task
df = mlbench::mlbench.waveform(500)
wf.task = makeClassifTask(id = "waveform", data = as.data.frame(df), target = "classes")
## Cost matrix
costs = matrix(c(0, 5, 10, 30, 0, 8, 80, 4, 0), 3)
colnames(costs) = rownames(costs) = getTaskClassLevels(wf.task)
## Performance measure
wf.costs = makeCostMeasure(id = "wf.costs", name = "Waveform costs", costs = costs,
best = 0, worst = 10)
In the multi-class case, both thresholding and rebalancing correspond to cost matrices of a certain structure where $c(j, k) = c(j)$ for $k \neq j$ and $j, k = 1, \ldots, K$. This condition means that the cost of misclassifying an observation is independent of the predicted class label (see Domingos, 1999). Given a cost matrix of this type, theoretical thresholds and weights can be derived in a similar manner as in the binary case. Obviously, the cost matrix given above does not have this special structure.
1. Thresholding
Given a vector of positive threshold values as long as the number of classes $K$, the predicted probabilities for all classes are adjusted by dividing them by the corresponding threshold value. Then the class with the highest adjusted probability is predicted. This way, as in the binary case, classes with a low threshold are preferred to classes with a larger threshold.
Again this can be done by function setThreshold as shown in the following example (or alternatively by the predict.threshold option of makeLearner). Note that the threshold vector needs to have names that correspond to the class labels.
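The adjustment rule itself is a one-liner in base R (the probability matrix and threshold vector below are made up):

```r
## Multi-class thresholding: divide each probability by its class threshold
## and predict the class with the largest ratio
prob = matrix(c(0.1, 0.6, 0.3,
                0.8, 0.1, 0.1), nrow = 2, byrow = TRUE,
              dimnames = list(NULL, c("1", "2", "3")))
th = c("1" = 0.2, "2" = 0.5, "3" = 0.3)

adjusted = sweep(prob, 2, th, "/")
colnames(prob)[max.col(adjusted)]
#> [1] "2" "1"
```

Class 3 wins neither observation here, but lowering its threshold would make it more competitive, exactly as in the binary case.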
lrn = makeLearner("classif.rpart", predict.type = "prob")
rin = makeResampleInstance("CV", iters = 3, task = wf.task)
r = resample(lrn, wf.task, rin, measures = list(wf.costs, mmce), show.info = FALSE)
#> Resampling: crossvalidation
#> Measures: wf.costs mmce
r
#> Resample Result
#> Task: waveform
#> Learner: classif.rpart
#> Aggr perf: wf.costs.test.mean=7.9568814,mmce.test.mean=0.3180386
#> Runtime: 0.0593178
## Calculate thresholds as 1/(average costs of true classes)
th = 2/rowSums(costs)
names(th) = getTaskClassLevels(wf.task)
th
#> 1 2 3
#> 0.01818182 0.22222222 0.11111111
pred.th = setThreshold(r$pred, threshold = th)
performance(pred.th, measures = list(wf.costs, mmce))
#> wf.costs mmce
#> 6.1248707 0.4699998
The threshold vector th in the above example is chosen according to the average costs of the true classes: 55, 4.5 and 9. More exactly, th corresponds to an artificial cost matrix of the structure mentioned above with off-diagonal elements $c(1, 2) = c(1, 3) = 55$, $c(2, 1) = c(2, 3) = 4.5$ and $c(3, 1) = c(3, 2) = 9$. This threshold vector may not be optimal but leads to smaller total costs on the data set than the default.
ii. Empirical thresholding
As in the binary case it is possible to tune the threshold vector using function tuneThreshold. Since the scaling of the threshold vector does not change the predicted class labels tuneThreshold returns threshold values that lie in [0,1] and sum to unity.
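The scale invariance is easy to verify in base R (made-up values): multiplying the threshold vector by any positive constant rescales all adjusted probabilities by the same factor and therefore leaves the predicted class unchanged.

```r
## Rescaling the threshold vector does not change the predicted class
prob = c(0.2, 0.5, 0.3)    # predicted probabilities of one observation
th = c(0.05, 0.63, 0.32)   # some threshold vector
which.max(prob / th)
#> [1] 1
which.max(prob / (10 * th))
#> [1] 1
```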
tune.res = tuneThreshold(pred = r$pred, measure = wf.costs)
tune.res
#> $th
#> 1 2 3
#> 0.03481266 0.32529865 0.63988869
#>
#> $perf
#> [1] 4.711613
For comparison we show the standardized version of the theoretically motivated threshold vector chosen above.
th/sum(th)
#> 1 2 3
#> 0.05172414 0.63218391 0.31609195
2. Rebalancing
i. Weighting
In the multiclass case you have to pass a vector of weights as long as the number of classes to function makeWeightedClassesWrapper. The weight vector can be tuned using function tuneParams.
lrn = makeLearner("classif.multinom", trace = FALSE)
lrn = makeWeightedClassesWrapper(lrn)
ps = makeParamSet(makeNumericVectorParam("wcw.weight", len = 3, lower = 0, upper = 1))
ctrl = makeTuneControlRandom()
tune.res = tuneParams(lrn, wf.task, resampling = rin, par.set = ps,
measures = list(wf.costs, mmce), control = ctrl, show.info = FALSE)
#> Resampling: crossvalidation
#> Measures: wf.costs mmce
tune.res
#> Tune result:
#> Op. pars: wcw.weight=0.673,0.0513...
#> wf.costs.test.mean=3.2633889,mmce.test.mean=0.2179496
Example-dependent misclassification costs
In case of example-dependent costs we have to create a special Task via function makeCostSensTask. For this purpose the feature values $x$ and an $n \times K$ cost matrix that contains the cost vectors for all $n$ examples in the data set are required.
We use the iris data and generate an artificial cost matrix (see Beygelzimer et al., 2005).
df = iris
cost = matrix(runif(150 * 3, 0, 2000), 150) * (1 - diag(3))[df$Species,] + runif(150, 0, 10)
colnames(cost) = levels(iris$Species)
rownames(cost) = rownames(iris)
df$Species = NULL
costsens.task = makeCostSensTask(id = "iris", data = df, cost = cost)
costsens.task
#> Supervised task: iris
#> Type: costsens
#> Observations: 150
#> Features:
#> numerics factors ordered
#> 4 0 0
#> Missings: FALSE
#> Has blocking: FALSE
#> Classes: 3
#> setosa, versicolor, virginica
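The cost matrix stored in the task can be retrieved at any time via getTaskCosts; a quick check on the task created above:

```r
# Sketch: access the per-example cost matrix stored in the CostSensTask.
head(getTaskCosts(costsens.task), 3)
```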
mlr provides several wrappers to turn regular classification or regression methods into Learners that can deal with example-dependent costs.
* makeCostSensClassifWrapper (wraps a classification Learner): This is a naive approach where the costs are coerced into class labels by choosing the class label with minimum cost for each example. Then a regular classification method is used.
* makeCostSensRegrWrapper (wraps a regression Learner): An individual regression model is fitted for the costs of each class. In the prediction step the costs are first predicted for all classes and then the class with the lowest predicted costs is selected.
* makeCostSensWeightedPairsWrapper (wraps a classification Learner): This is also known as cost-sensitive one-vs-one (CSOVO) and is the most sophisticated of the currently supported methods. For each pair of classes, a binary classifier is fitted. For each observation the class label is defined as the element of the pair with minimal costs. During fitting, the observations are weighted with the absolute difference in costs. Prediction is performed by simple voting.
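For comparison, the first two strategies can be set up analogously. The sketch below assumes the same multinom base learner for the classification wrapper and, as an arbitrary choice, a linear model for the regression wrapper:

```r
# Sketch: the naive approach, reducing costs to class labels with minimum cost
lrn.naive = makeCostSensClassifWrapper(makeLearner("classif.multinom", trace = FALSE))

# Sketch: one regression model per class, predicting the cost of each class
lrn.regr = makeCostSensRegrWrapper(makeLearner("regr.lm"))
```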
In the following example we use the third method. We create the wrapped Learner and train it on the CostSensTask defined above.
lrn = makeLearner("classif.multinom", trace = FALSE)
lrn = makeCostSensWeightedPairsWrapper(lrn)
lrn
#> Learner costsens.classif.multinom from package nnet
#> Type: costsens
#> Name: ; Short name:
#> Class: CostSensWeightedPairsWrapper
#> Properties: twoclass,multiclass,numerics,factors
#> PredictType: response
#> Hyperparameters: trace=FALSE
mod = train(lrn, costsens.task)
mod
#> Model for learner.id=costsens.classif.multinom; learner.class=CostSensWeightedPairsWrapper
#> Trained on: task.id = iris; obs = 150; features = 4
#> Hyperparameters: trace=FALSE
The models corresponding to the individual pairs can be accessed by function getLearnerModel.
getLearnerModel(mod)
#> [[1]]
#> Model for learner.id=classif.multinom; learner.class=classif.multinom
#> Trained on: task.id = feats; obs = 150; features = 4
#> Hyperparameters: trace=FALSE
#>
#> [[2]]
#> Model for learner.id=classif.multinom; learner.class=classif.multinom
#> Trained on: task.id = feats; obs = 150; features = 4
#> Hyperparameters: trace=FALSE
#>
#> [[3]]
#> Model for learner.id=classif.multinom; learner.class=classif.multinom
#> Trained on: task.id = feats; obs = 150; features = 4
#> Hyperparameters: trace=FALSE
mlr provides some performance measures for example-specific cost-sensitive classification. In the following example we calculate the mean costs of the predicted class labels (meancosts) and the misclassification penalty (mcp). The latter measure is the average difference between the costs caused by the predicted class labels, i.e., meancosts, and the costs resulting from choosing the class with the lowest cost for each observation. In order to compute these measures the costs for the test observations are required and therefore the Task has to be passed to performance.
pred = predict(mod, task = costsens.task)
pred
#> Prediction: 150 observations
#> predict.type: response
#> threshold:
#> time: 0.04
#> id response
#> 1 1 setosa
#> 2 2 setosa
#> 3 3 setosa
#> 4 4 setosa
#> 5 5 setosa
#> 6 6 setosa
#> ... (150 rows, 2 cols)
performance(pred, measures = list(meancosts, mcp), task = costsens.task)
#> meancosts mcp
#> 151.0839 146.2973
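As a sanity check, mcp can be reproduced by hand from the cost matrix generated above: it is meancosts minus the mean of the row-wise minimal costs. A sketch, assuming cost and pred from the preceding code:

```r
# Sketch: recompute meancosts and mcp manually.
y = getPredictionResponse(pred)                       # predicted class labels
picked = cost[cbind(seq_len(nrow(cost)), match(y, colnames(cost)))]
mean.costs = mean(picked)                             # cost actually incurred per example
mcp.manual = mean.costs - mean(apply(cost, 1, min))   # regret w.r.t. the cheapest class
```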