For more information, see SGDR: Stochastic Gradient Descent with Warm Restarts (Loshchilov & Hutter, 2017): https://arxiv.org/abs/1608.03983.

Usage

cb_es(monitor = "val_loss", patience = 3L)

cb_lr_scheduler_cosine_anneal(
  eta_max = 0.01,
  T_max = 10,
  T_mult = 2,
  M_mult = 1,
  eta_min = 0
)

cb_lr_scheduler_exponential_decay()

cb_tensorboard()

cb_lr_log()
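
For example, several callbacks can be constructed and collected in a list (a minimal sketch; how the list is then handed to the training routine depends on the package's fitting function, which is not documented on this page):

callbacks <- list(
  cb_es(monitor = "val_loss", patience = 3L),   # stop after 3 iterations without improvement
  cb_lr_scheduler_cosine_anneal(eta_max = 0.01, T_max = 10, T_mult = 2),
  cb_lr_log()                                   # record the learning rate over training
)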

Arguments

monitor

character
Quantity to be monitored.

patience

integer
Number of iterations without improvement to wait before stopping.

eta_max

numeric
Maximum learning rate. Default 0.01.

T_max

integer
The learning rate is reset every T_max epochs. Default 10.

T_mult

integer
Multiply T_max by T_mult after each restart, i.e. every time T_max epochs elapse. Default 2.

M_mult

numeric
Decay the learning rate by the factor M_mult after each restart. Default 1.

eta_min

numeric
Minimum learning rate. Default 0.

Details

Closed form: \(\eta_t = \eta_{min} + \frac{1}{2}(\eta_{max} - \eta_{min})\left(1 + \cos\left(\frac{T_{cur}}{T_{max}}\pi\right)\right)\), where \(T_{cur}\) is the number of epochs since the last restart and \(T_{max}\) is the current cycle length.
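
To make the schedule concrete, the following is a minimal sketch in plain R of the warm-restart schedule implied by the arguments above; the helper anneal_lr is hypothetical and for illustration only, not part of the package:

anneal_lr <- function(epoch, eta_max = 0.01, T_max = 10,
                      T_mult = 2, M_mult = 1, eta_min = 0) {
  t_cur <- epoch
  period <- T_max
  # Skip over completed cycles, applying the restart multipliers.
  while (t_cur >= period) {
    t_cur <- t_cur - period
    period <- period * T_mult     # next cycle lasts T_mult times longer
    eta_max <- eta_max * M_mult   # peak learning rate decays by M_mult
  }
  eta_min + 0.5 * (eta_max - eta_min) * (1 + cos(pi * t_cur / period))
}

# Learning rates for the first 70 epochs (cycles of length 10, 20, 40):
sapply(0:69, anneal_lr)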