Feed-forward neural network using Keras and TensorFlow. This learner builds and compiles the keras model from the hyperparameters in param_set and does not require a supplied, pre-compiled model.

Calls keras::fit() from package keras. Layers are set up as follows:

  • The inputs are connected to a layer_dropout, applying the input_dropout. Afterwards, each layer_dense() is followed by a layer_activation and, depending on the architecture hyperparameters, by a layer_batch_normalization and/or a layer_dropout. This is repeated length(layer_units) times, i.e. one 'dense->activation->batchnorm->dropout' block is appended for each element of layer_units. The activation of the last layer is 'softmax' or 'sigmoid' for classification and 'linear' or 'sigmoid' for regression. A minimal sketch of this assembly is given below.
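    A minimal sketch (not the package's actual internals) of how such an architecture could be assembled with keras; argument names mirror the hyperparameters described below:

    build_ff = function(input_shape, layer_units = c(32L, 32L, 32L),
                        activation = "relu", input_dropout = 0,
                        use_batchnorm = FALSE, use_dropout = FALSE,
                        dropout = 0, n_classes = 2L) {
      input = keras::layer_input(shape = input_shape)
      # input dropout applied directly to the inputs
      x = keras::layer_dropout(input, rate = input_dropout)
      # one 'dense->activation->batchnorm->dropout' block per element of layer_units
      for (units in layer_units) {
        x = keras::layer_dense(x, units = units)
        x = keras::layer_activation(x, activation = activation)
        if (use_batchnorm) x = keras::layer_batch_normalization(x)
        if (use_dropout)   x = keras::layer_dropout(x, rate = dropout)
      }
      # output activation: 'softmax'/'sigmoid' (classification), 'linear'/'sigmoid' (regression)
      output = keras::layer_dense(x, units = n_classes, activation = "softmax")
      keras::keras_model(input, output)
    }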

Parameters:
Most of the parameters are documented in the keras documentation; the exceptions are described here.

  • use_embedding: A logical flag: should embeddings be used? If TRUE, make_embedding is used; if FALSE, model.matrix(~ . - 1, data) converts factor, logical and ordered features into numeric features (see the example below).
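    Example of the use_embedding = FALSE path (toy data, purely illustrative):

    data = data.frame(x = c(1.5, 2.0), f = factor(c("a", "b")))
    model.matrix(~ . - 1, data)
    # yields the numeric columns x, fa, fb (full dummy coding of the factor)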

  • layer_units: An integer vector storing the number of units in each consecutive layer. layer_units = c(32L, 32L, 32L) results in a 3-layer network with 32 neurons in each layer. Can be integer(0), in which case a (multinomial) logistic regression model is fitted.

  • initializer: Weight and bias initializer.

    "glorot_uniform"  : initializer_glorot_uniform(seed)
    "glorot_normal"   : initializer_glorot_normal(seed)
    "he_uniform"      : initializer_he_uniform(seed)
    "..."             : see `??keras::initializer`
    
  • optimizer: Some optimizers and their arguments are listed below; an example of setting one follows the list.
    Inherits from tensorflow.python.keras.optimizer_v2.

    "sgd"     : optimizer_sgd(lr, momentum, decay = decay),
    "rmsprop" : optimizer_rmsprop(lr, rho, decay = decay),
    "adagrad" : optimizer_adagrad(lr, decay = decay),
    "adam"    : optimizer_adam(lr, beta_1, beta_2, decay = decay),
    "nadam"   : optimizer_nadam(lr, beta_1, beta_2, schedule_decay = decay)
    
  • regularizer: Regularizer for keras layers:

    "l1"      : regularizer_l1(l = 0.01)
    "l2"      : regularizer_l2(l = 0.01)
    "l1_l2"   : regularizer_l1_l2(l1 = 0.01, l2 = 0.01)
    
  • class_weights: A named list of class weights, where the names are the classes numbered from 0 to c-1 (for c classes).

    Example:
    wts = c(0.5, 1)
    setNames(as.list(wts), seq_along(wts) - 1)
    
  • callbacks: A list of keras callbacks. See ?callbacks.
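    Example using the keras package's callback constructors (illustrative settings):

    callbacks = list(
      keras::callback_early_stopping(monitor = "val_loss", patience = 5),
      keras::callback_reduce_lr_on_plateau(factor = 0.5, patience = 3)
    )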

Format

R6::R6Class() inheriting from LearnerClassifKeras.

Construction

LearnerClassifKerasFF$new()
mlr3::mlr_learners$get("classif.kerasff")
mlr3::lrn("classif.kerasff")

Learner Methods

Keras Learners offer several methods for easy access to the stored models.

  • .$plot()
    Plots the history, i.e. the train-validation loss during training.

  • .$save(file_path)
    Saves the model to the provided file_path in 'h5' format.

  • .$load_model_from_file(file_path)
    Loads a model saved via .$save() back into the learner. The keras model must be saved separately when the learner is serialized; in that case, this method restores it into the learner. Currently not implemented for 'TabNet'.
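    Example (hypothetical file path):

    learner$save("keras_model.h5")
    # ... serialize and restore the learner without the model, then:
    learner$load_model_from_file("keras_model.h5")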

  • .$lr_find(task, epochs, lr_min, lr_max, batch_size)
    Runs an implementation of the learning rate finder popularized by Jeremy Howard's fast.ai course (http://course.fast.ai/) on the learner. For more information on the parameters, see find_lr.
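    Example (illustrative settings on a built-in task):

    learner = mlr3::lrn("classif.kerasff")
    rates = learner$lr_find(mlr3::tsk("iris"), epochs = 30L, lr_min = 1e-4,
      lr_max = 0.8, batch_size = 128L)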

Examples

learner = mlr3::lrn("classif.kerasff")
print(learner)
#> <LearnerClassifKerasFF:classif.keras>
#> * Model: -
#> * Parameters: epochs=100, callbacks=<list>, validation_split=0.3333,
#>   batch_size=128, low_memory=FALSE, verbose=0, use_embedding=TRUE,
#>   embed_dropout=0, embed_size=<NULL>, activation=relu,
#>   layer_units=32,32,32,
#>   initializer=<keras.initializers.initializers_v2.GlorotUniform>,
#>   optimizer=<keras.optimizer_v2.adam.Adam>,
#>   regularizer=<keras.regularizers.L1L2>, use_batchnorm=FALSE,
#>   use_dropout=FALSE, dropout=0, input_dropout=0,
#>   loss=categorical_crossentropy, metrics=accuracy,
#>   output_activation=softmax
#> * Packages: keras
#> * Predict Type: response
#> * Feature types: integer, numeric, factor, ordered
#> * Properties: multiclass, twoclass

# available parameters:
learner$param_set$ids()
#>  [1] "epochs"            "model"             "class_weight"
#>  [4] "validation_split"  "batch_size"        "callbacks"
#>  [7] "low_memory"        "verbose"           "use_embedding"
#> [10] "embed_dropout"     "embed_size"        "layer_units"
#> [13] "initializer"       "regularizer"       "optimizer"
#> [16] "activation"        "use_batchnorm"     "use_dropout"
#> [19] "dropout"           "input_dropout"     "loss"
#> [22] "output_activation" "metrics"