R/LearnerKerasShapedMLP.R
LearnerClassifShapedMLP.Rd
Shaped MLP as used in Zimmer et al., Auto-PyTorch Tabular (2020), and proposed by https://mikkokotila.github.io/slate. Implements 'Search Space 1' from Zimmer et al. (2020) (https://arxiv.org/abs/2006.13799).
This learner builds and compiles the keras model from the hyperparameters in param_set, and does not require a supplied and compiled model.
R6::R6Class() inheriting from LearnerClassifKeras.
Calls keras::fit() from package keras.
Layers are set up as follows:
The inputs are connected to a layer_dropout, applying the input_dropout.
Afterwards, each layer_dense() is followed by a layer_activation and, depending on the architecture hyperparameters, by a layer_batch_normalization and/or a layer_dropout.
This is repeated length(layer_units) times, i.e. one 'dense->activation->batchnorm->dropout' block is appended for each layer_unit.
The last layer is either 'softmax' or 'sigmoid' for classification or 'linear' or 'sigmoid' for regression.
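The block ordering described above can be sketched in base R without building a model. This is only an illustration: plan_layers is a hypothetical helper (not part of the package) that lists the layer sequence as strings, and its argument names are borrowed from the hyperparameters shown on this page.

```r
# Hypothetical helper: list the layer sequence described above as plain strings.
# It builds no keras model; it only mirrors the documented ordering.
plan_layers <- function(layer_units, input_dropout = 0,
                        use_batchnorm = FALSE, use_dropout = TRUE,
                        output_activation = "softmax") {
  # The inputs are first connected to a dropout layer.
  plan <- sprintf("layer_dropout(rate = %s)", input_dropout)
  for (units in layer_units) {
    # Every dense layer is followed by an activation ...
    plan <- c(plan, sprintf("layer_dense(units = %d)", units), "layer_activation")
    # ... and optionally by batch normalization and/or dropout.
    if (use_batchnorm) plan <- c(plan, "layer_batch_normalization")
    if (use_dropout) plan <- c(plan, "layer_dropout")
  }
  c(plan, sprintf("output: %s", output_activation))
}

plan <- plan_layers(c(128, 64), use_batchnorm = TRUE)
print(plan)
```

With two hidden layers and both optional layers enabled, the plan contains one input dropout, two four-layer blocks, and the output layer.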
Parameters:
Most of the parameters are described in the keras documentation.
The exceptions are documented here.
use_embedding: A logical flag: should embeddings be used?
If TRUE, make_embedding is used; if FALSE, model.matrix(~. - 1, data) converts factor, logical and ordered factor features into numeric features.
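As a rough illustration of the use_embedding = FALSE path, the snippet below applies model.matrix(~ . - 1, data) to a small made-up data frame. The column names and values are invented for the example, and how the learner calls this internally is not shown here.

```r
# Toy data with one column of each feature type mentioned above
# (all names and values are made up for illustration).
data <- data.frame(
  x = c(1.5, 2.0, 3.5, 4.0),
  f = factor(c("a", "b", "c", "a")),
  o = factor(c("lo", "hi", "lo", "hi"), levels = c("lo", "hi"), ordered = TRUE),
  flag = c(TRUE, FALSE, TRUE, TRUE)
)

# "- 1" drops the intercept column; non-numeric columns are expanded
# into numeric contrast columns by model.matrix().
mm <- model.matrix(~ . - 1, data)
print(colnames(mm))
print(dim(mm))
```

The result is a purely numeric matrix with one row per observation, which is the kind of input a dense network expects.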
n_layers: An integer defining the number of layers of the shaped MLP.
n_max: An integer defining the number of neurons in the first layer. The number of neurons is halved after each layer according to formula (1) in https://arxiv.org/abs/2006.13799.
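Taking the halving description literally, the layer widths could be computed as in the sketch below. shaped_units is a hypothetical helper, and the rounding and the lower bound of one neuron are assumptions; the exact formula in the paper and in the package may differ.

```r
# Hypothetical helper: widths of a shaped MLP when the width is halved per layer.
# Rounding down and the floor of 1 neuron are assumptions, not taken from the source.
shaped_units <- function(n_max, n_layers) {
  stopifnot(n_max >= 1, n_layers >= 1)
  units <- floor(n_max / 2^(seq_len(n_layers) - 1))
  as.integer(pmax(units, 1))
}

units <- shaped_units(n_max = 128, n_layers = 3)
print(units)
```

For n_max = 128 and n_layers = 3 this yields widths of 128, 64 and 32.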
initializer: Weight and bias initializer.
"glorot_uniform" : initializer_glorot_uniform(seed)
"glorot_normal" : initializer_glorot_normal(seed)
"he_uniform" : initializer_he_uniform(seed)
"..." : see `??keras::initializer`
optimizer: Some optimizers and their arguments can be found below.
Inherits from tensorflow.python.keras.optimizer_v2.
"sgd" : optimizer_sgd(lr, momentum, decay = decay)
"rmsprop" : optimizer_rmsprop(lr, rho, decay = decay)
"adagrad" : optimizer_adagrad(lr, decay = decay)
"adam" : optimizer_adam(lr, beta_1, beta_2, decay = decay)
"nadam" : optimizer_nadam(lr, beta_1, beta_2, schedule_decay = decay)
regularizer: Regularizer for keras layers:
"l1" : regularizer_l1(l = 0.01)
"l2" : regularizer_l2(l = 0.01)
"l1_l2" : regularizer_l1_l2(l1 = 0.01, l2 = 0.01)
class_weights: A named list of class weights, with the classes numbered from 0 to c-1 (for c classes).
Example:
wts = c(0.5, 1)
setNames(as.list(wts), seq_len(length(wts)) - 1)
callbacks: A list of keras callbacks.
See ?callbacks.
LearnerClassifShapedMLP$new()
mlr3::mlr_learners$get("classif.smlp")
mlr3::lrn("classif.smlp")
Keras Learners offer several methods for easy access to the stored models.
.$plot()
Plots the history, i.e. the train-validation loss during training.
.$save(file_path)
Dumps the model to a provided file_path in 'h5' format.
.$load_model_from_file(file_path)
Loads a model saved using .$save() back into the learner.
The model needs to be saved separately when the learner is serialized.
In this case, the learner can be restored with this method.
Currently not implemented for 'TabNet'.
.$lr_find(task, epochs, lr_min, lr_max, batch_size)
Runs an implementation of the learning rate finder, as popularized by Jeremy Howard in fast.ai (http://course.fast.ai/), for the learner.
For more information on the parameters, see find_lr.
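The general idea behind a learning rate finder is to train briefly while sweeping the learning rate from lr_min to lr_max and recording the loss at each step. The sketch below only generates such a sweep on a geometric grid. lr_schedule is a hypothetical helper, and whether find_lr uses this exact spacing is an assumption.

```r
# Hypothetical helper: geometrically spaced learning rates from lr_min to lr_max.
# This illustrates the sweep only; it does not train a model or call find_lr.
lr_schedule <- function(lr_min, lr_max, n_steps) {
  stopifnot(lr_min > 0, lr_max > lr_min, n_steps >= 2)
  lr_min * (lr_max / lr_min)^seq(0, 1, length.out = n_steps)
}

lrs <- lr_schedule(lr_min = 1e-4, lr_max = 1e-1, n_steps = 4)
print(lrs)
```

In a full finder, each of these rates would be used for one or more batches, and a rate shortly before the loss starts to diverge is typically chosen.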
#> <LearnerClassifShapedMLP:classif.keras>
#> * Model: -
#> * Parameters: epochs=100, validation_split=0.3333, batch_size=128,
#>   callbacks=<list>, low_memory=FALSE, verbose=0, use_embedding=FALSE,
#>   embed_dropout=0, embed_size=<NULL>, n_max=128, n_layers=2,
#>   initializer=<keras.initializers.initializers_v2.GlorotUniform>,
#>   regularizer=<keras.regularizers.L1L2>,
#>   optimizer=<keras.optimizer_v2.gradient_descent.SGD>, activation=relu,
#>   use_batchnorm=FALSE, use_dropout=TRUE, dropout=0, input_dropout=0,
#>   loss=categorical_crossentropy, output_activation=softmax,
#>   metrics=accuracy
#> * Packages: keras
#> * Predict Type: response
#> * Feature types: integer, numeric, factor, ordered
#> * Properties: multiclass, twoclass
# available parameters:
learner$param_set$ids()
#>  [1] "epochs"            "model"             "class_weight"
#>  [4] "validation_split"  "batch_size"        "callbacks"
#>  [7] "low_memory"        "verbose"           "use_embedding"
#> [10] "embed_dropout"     "embed_size"        "n_max"
#> [13] "n_layers"          "initializer"       "regularizer"
#> [16] "optimizer"         "activation"        "use_batchnorm"
#> [19] "use_dropout"       "dropout"           "input_dropout"
#> [22] "loss"              "output_activation" "metrics"